(54) Stress proteins

(57) Described are a stress protein named ORP150, polynucleotides encoding said protein and
antibodies against the ORP150 protein. Also described are pharmaceutical compositions
comprising these proteins, polynucleotides or antibodies, and their use for the treatment
of ischemic diseases.






EP 0 780 472 A2 

Description 

The present invention relates to an oxygen-regulated protein 150 (ORP150). Specifically, the invention relates to 
the amino acid sequence of such ORP150 polypeptides, polynucleotides encoding ORP150 polypeptides, promoters 
of ORP150 genes and antibodies specific to ORP150 polypeptides.

Since the expression of a 70 kDa heat shock protein (HSP70) in cerebral ischemic lesions was first reported,
various stress proteins, represented by HSP70, have been reported to be expressed in myocardial ischemic and
atherosclerotic lesions, as well as cerebral ischemic lesions. The fact that the induction of HSP, a mechanism of defence
against heat stress, is seen in ischemic lesions suggests that the stress response of the body to ischemic hypoxia is
an active phenomenon involving protein neogenesis. In cultured cells, stressful situations that cause ischemia
in vivo, such as hypoglycemia and hypoxia, have been shown to induce a group of non-HSP stress proteins, such as
glucose-regulated protein (GRP) and oxygen-regulated protein (ORP).

ORP is therefore expected to serve in the diagnosis and treatment of ischemic diseases. 

Hori et al. have recently found that exposure of cultured rat astrocytes to hypoxic conditions induces 150, 94, 78,
33 and 28 kDa proteins [J. Neurochem., 66, 973-979 (1996)]. These proteins, other than the 150 kDa protein, were
identified as GRP94, GRP78, heme oxygenase 1 and HSP28, respectively, while the 150 kDa protein (rat ORP150) remains
unidentified. In addition, there has been no report of a human ORP150 protein.

Accordingly, the technical problem underlying the present invention is to provide ORP150 proteins, namely those
of human and rat origin, the amino acid sequences of these proteins as well as nucleotide sequences encoding these
proteins, the promoter regions of the corresponding genes and antibodies against ORP150 proteins or fragments
thereof which are useful in the diagnosis and treatment of ischemic diseases.

This technical problem has been solved by the provision of the embodiments characterized in the claims. 

Thus, in a first aspect, the present invention relates to a polynucleotide encoding an ORP150 polypeptide selected
from the group consisting of:

(a) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:1 or a
fragment of the polypeptide;

(b) polynucleotides comprising the coding region of the nucleotide sequence as shown in SEQ ID NO:2 or a
fragment thereof;

(c) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:3 or a
fragment of the polypeptide;

(d) polynucleotides comprising the coding region of the nucleotide sequence as depicted in SEQ ID NO:4 or a
fragment thereof;

(e) polynucleotides encoding an ORP150 polypeptide which differs from the polypeptide encoded by the
polynucleotide of (a) or (c) due to deletion(s), addition(s), insertion(s) and/or substitution(s) of one or more
amino acid residues; and

(f) polynucleotides the complementary strand of which hybridizes to a polynucleotide of any one of (a) to (e) and
which encode an ORP150 polypeptide;

and the complementary strand of such a polynucleotide.

In still another embodiment, the present invention relates to a polynucleotide capable of hybridizing to the above 
polynucleotide or a fragment thereof and having promoter activity. 

In still another embodiment, the present invention relates to a recombinant DNA, e.g. a vector, which contains a
nucleotide sequence of the present invention.

In still another embodiment, the present invention relates to an expression vector which contains the recombinant
DNA of the present invention, to host cells transformed with polynucleotides or vectors of the invention and to a process
for the production of an ORP150 protein by cultivating such host cells. In a further embodiment, the present invention
relates to the polypeptides encoded by the polynucleotides of the invention.

In still another embodiment, the present invention relates to an antibody or fragment thereof which specifically
binds to the polypeptide of the present invention, and to nucleic acid molecules which specifically hybridize to
polynucleotides of the present invention.

In still another embodiment the present invention relates to pharmaceutical and diagnostic compositions compris- 
ing the above-described polynucleotides, polypeptides, antibodies and/or nucleic acid molecules. 

Figure 1 shows a schematic diagram of the exon-intron structure of the human ORP gene. Black squares
represent the exons.

Figure 2 shows the results of the Northern blot analysis of ORP150 mRNA extracted from human astrocytoma 
U373 cells after exposure to various types of stress. 

Figure 3 shows the results of the Northern blot analysis of ORP150 mRNA from adult human tissues.

One embodiment of a polynucleotide of the present invention is a polynucleotide encoding a polypeptide comprising
the amino acid sequence shown by SEQ ID NO:1 in the sequence listing, and constituting the human oxygen-regulated
protein ORP150 which is obtainable by inducement under hypoxic conditions. Another embodiment of a
polynucleotide of the present invention is a polynucleotide encoding a polypeptide comprising the amino acid sequence
shown by SEQ ID NO:3 in the sequence listing, and constituting the rat oxygen-regulated protein ORP150 which is
obtainable by inducement under hypoxic conditions. The polynucleotides of the present invention also include those
which code for polypeptides each comprising a portion of the above-described polypeptides, and those encoding
all or part of the above-described polypeptides. It is well known that mutation occurs in nature; some of the
amino acids of ORP150 protein may be replaced or deleted, and other amino acids may be added or inserted. Mutation
can also be induced by gene engineering technology. It is therefore to be understood that substantially homologous
polypeptides resulting from such mutations in one or more amino acid residues are also included in the scope of the
present invention as long as they are obtainable by inducement under hypoxic conditions.

Further embodiments of a polynucleotide of the present invention are polynucleotides comprising the nucleotide
sequence shown by SEQ ID NO:2 in the sequence listing, i.e., human ORP150 cDNA, and polynucleotides comprising
the nucleotide sequence shown by SEQ ID NO:4 in the sequence listing, which represents rat ORP150 cDNA.
Polynucleotides comprising a portion of these polynucleotides, and those containing all or part of these
polynucleotides, are also included in the scope of the present invention. As stated above, the ORP150 gene may have some
bases replaced, deleted, added or inserted by mutations, and the resulting polynucleotides with partially different
nucleotide sequences are also included in the scope of the present invention, as long as they are substantially homologous
and encode a polypeptide obtainable by inducement under hypoxic conditions.

The present invention also relates to a polynucleotide the complementary strand of which hybridizes to a
polynucleotide as described above and which codes for an ORP150 polypeptide, i.e., for a polypeptide inducible under
hypoxic conditions. "Hybridizing" in this regard preferably means hybridization under stringent conditions. The
hybridizing polynucleotides preferably have a sequence identity of at least 50%, most preferably of at least 70%, with the
polynucleotides described above. The term "stringent conditions" means that hybridization will occur only if there is at least
95% and preferably at least 97% identity between the sequences.
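
The identity thresholds named above (50%, 70%, 95%, 97%) can be illustrated with a minimal sketch that scores two sequences already aligned to equal length. The function names, the gap character and the rule that gapped columns count as mismatches are assumptions made for illustration; the specification does not state how sequence identity is to be calculated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns with the same base in both sequences.

    Assumption: columns in which either sequence has a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least `threshold` (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences differing at one of ten positions.
example_a = "ACGTACGTAC"
example_b = "ACGTACGTAA"
print(percent_identity(example_a, example_b))
print(meets_threshold(example_a, example_b, 70.0))
print(meets_threshold(example_a, example_b, 95.0))
```

With these toy inputs the pair would satisfy the 70% identity preference but not the 95% figure given for stringent conditions.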

The polynucleotides of the present invention may be RNA or DNA molecules. DNA molecules can, for example, be 
cDNA, genomic DNA, double or single stranded DNA, isolated from natural sources, produced in vitro or by chemical 
synthesis methods. The polynucleotides of the invention can code for an ORP150 polypeptide from any organism 
expressing such a polypeptide, preferably from eukaryotes, for example, insects, vertebrates, preferably mammals and
most preferably from human, rat, mouse, bovine, sheep, goat or pig.

Furthermore, the present invention also relates to recombinant nucleic acid molecules which comprise a
polynucleotide according to the invention. Examples of such molecules are vectors, namely plasmids, cosmids, phagemids,
recombinant phages, viruses etc.

In a preferred embodiment the polynucleotide according to the invention present in such a recombinant nucleic acid
molecule is linked to regulatory elements which allow for expression in prokaryotic or eukaryotic host cells. Such
regulatory elements are well known in the art and include promoters, transcriptional and translational enhancers and the
like.

The term "recombinant DNA" as used herein is defined as any DNA containing a polynucleotide described above. 

The term "expression vector" as used herein is defined as any vector containing the recombinant DNA of the
present invention and expressing a desired protein by introduction into the appropriate host.

The term "clone" as used herein means not only a cell into which a polynucleotide of interest has been introduced 
but also the polynucleotide of interest itself. 

The term "inducement under hypoxic conditions" used herein means an increase in protein synthesis upon
exposing cells to an oxygen-depleted atmosphere.

The present invention furthermore relates to host cells transformed and genetically engineered with a
polynucleotide according to the invention. These may be prokaryotic or eukaryotic cells. They may be homologous or heterologous
with respect to the introduced polynucleotide. If they are homologous they can be distinguished from naturally occurring
cells by the feature that they comprise, in addition to a naturally occurring ORP150 gene, at least one further copy of an
ORP150 coding region which is integrated into the genome in a position in which it does not normally occur. This can
be confirmed, e.g., by Southern blotting. Suitable host cells include, for example, bacteria such as E. coli and Bacillus
subtilis, yeast such as S. cerevisiae, vertebrate cells, insect cells, mammalian cells, e.g. rat, mouse or human cells.

Moreover, the present invention relates to a process for the production of an ORP150 polypeptide which comprises 
the steps of culturing the host according to the invention and recovering the produced polypeptide from the cells and/or 
the culture medium. 

The present invention also relates to the polypeptides encoded by the polynucleotides according to the invention
or obtainable by the above-described process.

The amino acid sequences and nucleotide sequences of the present invention can, for example, be determined as
follows: First, poly(A)+ RNA is prepared from rat astrocytes exposed to hypoxic conditions. After cDNA is synthesized
from said poly(A)+ RNA using random hexamer primers, a cDNA library is prepared using the pSPORT1 vector
(produced by Life Technology), or the like.

Next, PCR is conducted using oligonucleotide primers synthesized on the basis of the nucleotide sequence of the
pSPORT1 vector used to prepare the cDNA library above and the degenerate nucleotide sequences deduced from the
N-terminal amino acid sequence of purified rat ORP150, to yield a large number of amplified DNA fragments. These
DNA fragments are then inserted into the pT7 Blue vector (produced by Novagen), or the like, for cloning to obtain a
clone having a nucleotide sequence which exactly encodes the N-terminal amino acid sequence. Purification of
ORP150 can be achieved by commonly used methods of protein purification, such as column chromatography and
electrophoresis, in combination as appropriate.

In addition, by screening the above-described rat astrocyte cDNA library by colony hybridization using the insert in
the above clone as a probe, a clone having an insert thought to encode rat ORP150 can be obtained. This clone is
subjected to stepwise deletion from both the 5'- and 3'-ends, and oligonucleotide primers prepared from determined
nucleotide sequences are used to determine the nucleotide sequence sequentially. If the clone thus obtained does not
encode the full length of rat ORP150, an oligonucleotide probe is synthesized on the basis of the nucleotide sequence
of the 5'- or 3'-region of the insert, followed by screening for a clone containing the nucleotide sequence extended
further in the 5' or 3' direction, for example, using the Gene Trapper cDNA Positive Selection System Kit (produced by
Life Technology), which is based on hybridization using magnetic beads. The full-length cDNA of the rat ORP150 gene is thus obtained.

Separately, the following procedure is followed to obtain a human homologue of rat ORP150 cDNA. Poly(A)+ RNA
is prepared from the human astrocytoma U373 exposed to hypoxic conditions. After cDNA is synthesized from said
poly(A)+ RNA using random hexamer primers and an oligo(dT) primer, said cDNA is inserted into the EcoRI site of the
pSPORT1 vector to prepare a cDNA library. Human ORP150 cDNA is then obtained using the Gene Trapper Kit and
the nucleotide sequence is determined in the same manner as with rat ORP150 above.

The nucleotide sequence of human ORP150 cDNA is thus determined as that shown by SEQ ID NO:2 in the 
sequence listing, based on which the amino acid sequence of human ORP150 is determined. 

Exposure of astrocytes to hypoxic conditions can, for example, be achieved by the method of Ogawa et al. [Ogawa,
S., Gerlach, H., Esposito, C., Macaulay, A.P., Brett, J., and Stern, D., J. Clin. Invest., 85, 1090-1098 (1990)].

Furthermore, the following procedure is followed to obtain human ORP150 genomic DNA. A genomic library
purchased from Clontech (derived from human placenta, Cat. #HL1067J) is used. Screening is conducted by hybridization
using a DNA fragment consisting of 202 bp of the 5' untranslated region and 369 bp of the coding region, derived from
the rat cDNA clone, as well as a 1351 bp DNA fragment containing the termination codon, derived from the human
cDNA, as probes. Two clones containing the ORP150 gene are isolated, one containing exons 1 through 24 and the
other containing exons 16 through 26; the entire ORP150 gene is reconstructed by combining these two clones. The
nucleotide sequence of the 15851 bp human ORP150 genomic DNA is determined; its nucleotide sequence from the 5'-end
to just before the translation initiation codon ATG in exon 2 is shown by SEQ ID NO:12 in the sequence listing.
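
The coverage provided by the two genomic clones can be checked with simple set arithmetic. The sketch below assumes that exon 26 is the last exon of the gene, as the clone description implies; the function and variable names are illustrative only.

```python
def covered_exons(clone_ranges):
    """Return the sorted exon numbers covered by inclusive (first, last) ranges."""
    exons = set()
    for first, last in clone_ranges:
        exons.update(range(first, last + 1))
    return sorted(exons)


# Exon ranges of the two genomic clones as stated in the text.
clone_a = (1, 24)
clone_b = (16, 26)

all_exons = covered_exons([clone_a, clone_b])
# Exons present in both clones, i.e. the region where the clones overlap.
shared = sorted(set(covered_exons([clone_a])) & set(covered_exons([clone_b])))

print(all_exons == list(range(1, 27)))
print(shared)
```

Under this assumption the two clones together cover exons 1 through 26 without a gap and overlap in exons 16 through 24.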

As stated above, the present invention includes polypeptides containing all or part of the polypeptide
(human ORP150) having the amino acid sequence shown by SEQ ID NO:1 in the sequence listing. The present
invention also includes all or part of the polypeptide having the amino acid sequence shown by SEQ ID NO:1 in
the sequence listing; for example, polynucleotides containing all or part of the nucleotide sequence shown by
SEQ ID NO:2 in the sequence listing are included in the scope of the present invention. The present invention also
includes specific antibodies against these polypeptides of the present invention, and fragments thereof.

An antibody against a polypeptide of the present invention, which polypeptide contains all or part of
human or rat ORP150, can be prepared by a conventional method [Current Protocols in Immunology, Coligan, J.E. et
al. eds., 2.4.1-2.4.7, John Wiley & Sons, New York (1991)]. Specifically, a rat ORP150 band, separated by, for example,
SDS-polyacrylamide gel electrophoresis, is cut out and administered to a rabbit or the like for immunization, after which
blood is collected from the immunized animal to obtain an antiserum. An IgG fraction can be obtained if necessary by affinity
chromatography using immobilized protein A, or the like. A peptide identical to a partial amino acid sequence of ORP150
can be chemically synthesized as a multiple antigen peptide (MAP) [Tam, J.P., Proc. Natl. Acad. Sci. USA, 85,
5409-5413 (1988)], and can be used for immunization in the same manner as above.

It is also possible to prepare a monoclonal antibody by a conventional method [Cell & Tissue Culture; Laboratory
Procedure (Doyle, A. et al., eds.) 25A:1-25C:4, John Wiley & Sons, New York (1994)] using a polypeptide containing
all or part of human or rat ORP150 as an antigen. Specifically, a hybridoma is prepared by fusing splenocytes
from a mouse immunized with said antigen with a myeloma cell line, and the resulting hybridoma is cultured or
intraperitoneally transplanted into a mouse to produce a monoclonal antibody.

The fragments resulting from protease digestion of these purified antibodies can also be used as antibodies of
the present invention.

The present invention also relates to nucleic acid molecules which specifically hybridize with a polynucleotide
according to the invention or with the complementary strand of such a polynucleotide. "Specifically hybridizing" means
that such molecules show no significant cross-hybridization to polynucleotides coding for proteins other than an
ORP150 polypeptide. Preferably these nucleic acid molecules have a length of at least 15 nucleotides, more preferably
of at least 30 nucleotides and most preferably of at least 50 nucleotides. In a preferred embodiment these molecules
have over their entire length a sequence identity to a corresponding region of a polynucleotide of the invention of at least
85%, preferably of at least 90% and most preferably of at least 95%. In a particularly preferred embodiment the
sequence identity is at least 97%. These nucleic acid molecules can be used, for example, as hybridization probes for
the isolation of related genes, as PCR primers, for the diagnosis of mutations of ORP150 genes, for use in antisense
molecules or ribozymes, or the like.

The polynucleotides of the present invention, the polypeptides encoded by them, specific antibodies against these 
polypeptides or fragments thereof and the nucleic acid molecules specifically hybridizing to the above-mentioned poly- 
nucleotides are useful in the diagnosis and treatment of ischemic diseases, permitting utilization for the development of 
therapeutic drugs for ischemic diseases. 

Thus, the present invention also relates to a pharmaceutical composition comprising a polynucleotide, polypeptide,
antibody and/or nucleic acid molecule according to the invention. Optionally, such a composition also comprises a
pharmaceutically acceptable carrier.

The invention also relates to a diagnostic composition comprising a polynucleotide, polypeptide, antibody and/or
nucleic acid molecule according to the invention.

In another embodiment the present invention relates to a polynucleotide comprising all or part
of the nucleotide sequence shown by SEQ ID NO:12 in the sequence listing. This is a polynucleotide containing the
promoter region of the human ORP150 gene. Polynucleotides capable of hybridizing to this polynucleotide under
conventional hybridizing conditions (e.g., in 0.1 x SSC containing 0.1% SDS at 65°C) and possessing promoter activity are
also included in the scope of the present invention. Preferably, such a promoter is able to promote transcription in cells
when exposed to hypoxia. Successful cloning of said promoter region would dramatically advance the functional
analysis of the human ORP150 gene and facilitate its application to the treatment of ischemic diseases.

The term "promoter" as used herein is defined as a polynucleotide comprising a nucleotide sequence that activates
or suppresses the transcription of a desired gene by being present upstream or downstream of said gene.

The following examples illustrate the present invention.

Example 1

Cell culture and achievement of hypoxic condition 

Rat primary astrocytes and microglia were obtained from neonatal rats by a modification of a previously described
method [Maeda, Y., Matsumoto, M., Ohtsuki, T., Kuwabara, K., Ogawa, S., Hori, O., Shui, D.Y., Kinoshita, T., Kamada,
T., and Stern, D., J. Exp. Med., 180, 2297-2308 (1994)]. Briefly, cerebral hemispheres were harvested from neonatal
Sprague-Dawley rats within 24 hours after birth, meninges were carefully removed, and brain tissue was digested at
37°C in minimal essential medium (MEM) with Joklik's modification (Gibco, Boston MA) containing Dispase II (3 mg/ml;
Boehringer-Mannheim, Germany). After centrifugation, the cell pellet was resuspended and grown in MEM
supplemented with fetal calf serum (FCS; 10%; CellGrow, MA).

After 10 days, cytosine arabinofuranoside (10 µg/ml; Wako Chemicals, Osaka, Japan) was added for 48 hours to
prevent fibroblast overgrowth, and culture flasks were agitated on a shaking platform. Then, floating cells were
aspirated (these were microglia), and the adherent cell population was identified by morphological criteria and
immunohistochemical staining with anti-glial fibrillary acidic protein antibody. Cultures used for experiments were >98% astrocytes
based on these techniques.

Human astrocytoma cell line U373 was obtained from American Type Culture Collection (ATCC) and cultured in 
Dulbecco's modified Eagle medium (produced by Life Technology) supplemented with 10% FCS. 

Cells were plated at a density of about 5 x 10^4 cells/cm^2 in the above medium. When cultures achieved
confluence, they were exposed to hypoxia using an incubator attached to a hypoxia chamber which maintained a humidified
atmosphere with low oxygen tension (Coy Laboratory Products, Ann Arbor MI) as described previously [Ogawa, S.,
Gerlach, H., Esposito, C., Macaulay, A.P., Brett, J., and Stern, D., J. Clin. Invest., 85, 1090-1098 (1990)].

Example 2 


Purification and N-terminal sequencing of the rat 150 kDa polypeptide 

Rat primary astrocytes (about 5 x 10^8 cells) exposed to hypoxia for 48 hours were harvested, cells were washed
three times with PBS (pH 7.0) and protein was extracted with PBS containing NP-40 (1%), PMSF (1 mM), and EDTA
(5 mM). Extracts were then filtered (0.45 µm nitrocellulose membrane), and either subjected to reduced SDS-PAGE
(7.5%, about 25 µg) or 2-3 mg of protein was diluted with 50 ml of PBS (pH 7.0) containing NP-40 (0.05%) and EDTA
(5 mM), and applied to FPLC Mono Q (bed volume 5 ml, Pharmacia, Sweden).

The column was washed with 0.2 M NaCl, eluted with an ascending salt gradient (0.2 to 1.8 M NaCl) and 10 µl of
each fraction (0.5 ml) was applied to reduced SDS-PAGE (7.5%), along with molecular weight markers (Biorad).
Proteins in the gel were visualized by silver staining. Fractions eluted from FPLC Mono Q which contained the 150 kDa
polypeptide (#7-8) were pooled and concentrated 50-fold by ultrafiltration (Amicon), and about 200 µg of protein was
applied to preparative, reduced SDS-PAGE (7.5%). Following electrophoresis, proteins in the gel were transferred
electrophoretically (2 A/cm^2) to polyvinylidene difluoride (PVDF) paper (Millipore, Tokyo), the paper was dried, stained with
Coomassie Brilliant blue, and the band corresponding to the 150 kDa protein (ORP150) was cut out for N-terminal
sequencing using an automated peptide sequencing system (Applied Biosystems, Perkin-Elmer). The N-terminal
31-amino acid sequence was thus determined (SEQ ID NO:5).

Example 3 


Preparation of rat astrocyte cDNA library 

Total RNA was prepared from rat primary astrocytes (1.1 x 10^8 cells), in which ORP150 had been induced under
hypoxic conditions, by the acid guanidinium-phenol-chloroform method [Chomczynski, P. and Sacchi, N., Anal.
Biochem., 162, 156-159 (1987)]. Using 300 µg of the total RNA obtained, purification was conducted twice in accordance
with the protocol for poly(A)+ RNA purification using oligo(dT)-magnetic beads (produced by Perceptive Diagnostics),
to yield poly(A)+ RNA. Double-stranded cDNA was then synthesized using random hexamer primers, in accordance
with the protocol for the Superscript Choice System (produced by Life Technology), and inserted into the EcoRI site of
the pSPORT1 vector to prepare a cDNA library consisting of 5.4 x 10^5 independent clones.


Example 4 

Cloning of rat ORP150 cDNA

Rat ORP150 cDNA was cloned as follows: First, to obtain a probe for colony hybridization, the cDNA library was
subjected to PCR using a 20-base primer, 5'-AATACGACTCACTATAGGGA-3' (SEQ ID NO:6), which corresponds to the
antisense strand of the T7 promoter region in the pSPORT1 vector, and 20-base mixed primers,
5'-AARCCiGGiGTNCCNATGGA-3' (SEQ ID NO:8), which contain inosine residues (i) and degenerate nucleotides and which were
prepared on the basis of the oligonucleotide sequence deduced from a partial sequence (KPGVPME) (SEQ ID NO:7)
within the N-terminal amino acid sequence (LAVMSVDLGSESMKVAIVKPGVPMEIVLNKE) (SEQ ID NO:5); the resulting
PCR product with a length of about 480 bp was inserted into the pT7 Blue plasmid vector. Nucleotide sequences of
the clones containing an insert of the expected size (480 bp) corresponding to the PCR product were determined using
an automatic nucleotide sequencer (produced by Perkin-Elmer, Applied Biosystems). A clone containing a
39-nucleotide sequence encoding a peptide identical to the rat ORP150-specific amino acid sequence KPGVPMEIVLNKE (SEQ
ID NO:9) in the insert was thus obtained.
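
The relationship between the peptide KPGVPME and the mixed primer of SEQ ID NO:8 can be illustrated by back-translating the peptide into degenerate codons. The sketch below is not the procedure used in the example; the codon table covers only the residues of this peptide, and the function names are assumptions. It uses standard IUPAC symbols (R = A or G, N = any base) and treats inosine as a single base that adds no degeneracy.

```python
from typing import Optional

# Degenerate codons (IUPAC) for the residues of KPGVPME only;
# other amino acids are deliberately omitted from this sketch.
DEGENERATE_CODON = {
    "K": "AAR",  # AAA, AAG
    "P": "CCN",
    "G": "GGN",
    "V": "GTN",
    "M": "ATG",
    "E": "GAR",  # GAA, GAG
}

# Number of distinct bases each symbol contributes to a primer mixture.
# Inosine (I) is one synthetic base, so it contributes a factor of 1.
CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "N": 4, "I": 1}


def back_translate(peptide: str, length: Optional[int] = None) -> str:
    """Concatenate degenerate codons and optionally trim to `length` bases."""
    seq = "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())
    return seq if length is None else seq[:length]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences contained in a mixed primer."""
    total = 1
    for base in primer.upper():
        total *= CHOICES[base]
    return total


fully_degenerate = back_translate("KPGVPME", length=20)
patent_primer = "AARCCiGGiGTNCCNATGGA"  # SEQ ID NO:8, i = inosine

print(fully_degenerate, degeneracy(fully_degenerate))
print(patent_primer, degeneracy(patent_primer))
```

Trimming the back-translation to 20 bases gives AARCCNGGNGTNCCNATGGA, a mixture of 512 sequences. Replacing two of the four N positions with inosine, as in SEQ ID NO:8, reduces the mixture to 32 sequences.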

Using the above insert of the clone as a probe, RNA from cultured rat astrocytes was subjected to Northern
blotting; the results demonstrated that mRNA with a length of about 4 kb was induced by hypoxic treatment. Thereupon,
the above insert of the clone was labeled by the random prime labeling method (Ready TOGO, produced by
Pharmacia) using α-[32P]dCTP to yield a probe. Using this probe, 1.2 x 10^4 clones of the cDNA library were screened by colony
hybridization to obtain a clone containing a 2800 bp insert. The nucleotide sequence of this clone insert was
determined by preparing deletion mutants using a kilosequence deletion kit (produced by Takara Shuzo).

Since this clone did not contain the 3'-region of the ORP150 coding sequence, the following two 20-base oligonu- 
cleotides were prepared on the basis of the specific nucleotide sequence near the 3' end of the above insert, to obtain 
the full-length sequence. 

5'-GCACCCTTGAGGAAAATGCT-3' (SEQ ID NO:10)
5'-CCCAGAAGCCCAATGAGAAG-3' (SEQ ID NO:11) 

Using the two oligonucleotides, a clone containing the entire coding region was selected from the rat astrocyte
cDNA library in accordance with the protocol for the Gene Trapper cDNA Positive Selection System (produced by Life
Technology), and its nucleotide sequence was determined.

The nucleotide sequence of rat ORP150 cDNA was thus determined as shown by SEQ ID NO:4 in the sequence
listing. Based on this nucleotide sequence, the amino acid sequence of rat ORP150 was determined as shown by SEQ
ID NO:3 in the sequence listing.
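
Basic properties of the two 20-base oligonucleotides (SEQ ID NO:10 and SEQ ID NO:11) can be computed with a short sketch. The Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, is used here only as a rough estimate for short oligonucleotides; it is an assumption of this illustration and is not mentioned in the example, and the function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """GC content in percent."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (assumed here)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Oligonucleotide sequences as given in the text.
OLIGOS = {
    "SEQ ID NO:10": "GCACCCTTGAGGAAAATGCT",
    "SEQ ID NO:11": "CCCAGAAGCCCAATGAGAAG",
}

for name, seq in OLIGOS.items():
    print(name, len(seq), gc_content(seq), wallace_tm(seq), reverse_complement(seq))
```

By this estimate SEQ ID NO:10 has 50% GC and a Tm of about 60°C, and SEQ ID NO:11 has 55% GC and a Tm of about 62°C.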

Example 5 

Preparation of human U373 cDNA library 

Poly(A)+ RNA was purified from U373 cells (1 x 10^8 cells) in which human ORP150 had been induced under
hypoxic conditions, in the same manner as described in Example 3. Double-stranded cDNA was then synthesized in
accordance with the protocol for the Superscript Choice System (produced by Life Technology) using a 1:1 mixture of
random hexamer primers and an oligo(dT) primer. This cDNA was inserted into the EcoRI site of the pSPORT1 vector
to prepare a cDNA library consisting of 2 x 10^5 independent clones.

Specifically, the library was prepared as follows: Human U373 cells, cultured in 10 plastic petri dishes (150 mm in
diameter) (1 x 10^7 cells/dish), were subjected to hypoxic treatment for 48 hours by the method of Ogawa et al. [Ogawa,
S., Gerlach, H., Esposito, C., Macaulay, A.P., Brett, J., and Stern, D., J. Clin. Invest., 85, 1090-1098 (1990)] as described
in Example 3, after which total RNA was prepared by the acid guanidinium-phenol-chloroform method [Chomczynski,
P. and Sacchi, N., Anal. Biochem., 162, 156-159 (1987)]. Using 500 µg of the total RNA obtained, purification was
conducted twice in accordance with the protocol for poly(A)+ RNA purification using oligo(dT)-magnetic beads (produced
by Perceptive Diagnostics), to yield poly(A)+ RNA. Double-stranded cDNA was then synthesized using 5 µg of the
poly(A)+ RNA and a 1:1 mixture of random hexamer primers and an oligo(dT) primer, in accordance with the protocol
for the Superscript Choice System (produced by Life Technology), and inserted into the EcoRI site of the pSPORT1
vector to prepare a human U373 cDNA library consisting of 2 x 10^5 independent clones.

Example 6

Cloning of human ORP150 cDNA 

Using two primers (SEQ ID NO:10 and SEQ ID NO:11) prepared on the basis of the above-described rat ORP150
cDNA-specific sequence, a clone containing the entire coding region was selected from the human U373 cDNA library
in accordance with the protocol for the Gene Trapper cDNA Positive Selection System (produced by Life Technology),
and its nucleotide sequence was determined. The nucleotide sequence of human ORP150 cDNA was thus determined
as shown by SEQ ID NO:2 in the sequence listing.

Specifically, 2 x 10^4 clones of the human U373 cDNA library were amplified in accordance with the protocol for the
Gene Trapper cDNA Positive Selection System (produced by Life Technology). Five micrograms of the plasmid purified
from amplified clones were treated with the Gene II and Exo III nucleases included in the kit to yield single-stranded
DNA. An oligonucleotide (SEQ ID NO:10) prepared on the basis of the above-described rat ORP150 cDNA-specific
sequence was biotinylated and subsequently hybridized to the above single-stranded DNA at 37°C for 1 hour. The
single-stranded DNA hybridized to the oligonucleotide derived from rat ORP150 cDNA was selectively recovered by using
streptavidin-magnetic beads, and was treated with the repair enzyme included in the kit using the oligonucleotide
shown by SEQ ID NO:10 in the sequence listing as a primer, to yield double-stranded plasmid DNA.

The double-stranded plasmid DNA was then introduced into ElectroMax DH10B cells (produced by Life Technology)
in accordance with the protocol for the Gene Trapper cDNA Positive Selection System, followed by colony PCR in
accordance with the same protocol using two primers (SEQ ID NO:10 and SEQ ID NO:11) prepared on the basis of the
rat ORP150 cDNA-specific sequence, to select clones that yield a PCR product of about 550 bp. The nucleotide
sequence of the longest insert among these clones, corresponding to the human ORP150 cDNA, was determined as
shown by SEQ ID NO:2 in the sequence listing.

On the basis of this nucleotide sequence, the amino acid sequence of human ORP150 was determined as shown
by SEQ ID NO:1 in the sequence listing.
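The deduction of the amino acid sequence from the cDNA can be illustrated with a minimal sketch. It assumes the standard genetic code and takes the ATG at position 103 of SEQ ID NO:2 as the start codon; that position was read off the sequence listing for this illustration and is not stated in the text. The names `translate`, `SEQ2_HEAD` and `CDS_START` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 120 bases of SEQ ID NO:2, copied from the sequence listing in blocks of ten.
SEQ2_HEAD = (
    "TTGTGAAGGG" "CGCGGGTGGG" "GGGCGCTGCC" "GGCCTCGTGG" "GTACGTTCGT" "GCCGCGTCTG"
    "TCCCAGAGCT" "GGGGCCGCAG" "GAGCGGAGGC" "AAGAGGGGCA" "CTATGGCAGA" "CAAAGTTAGG"
)


def translate(dna, start):
    """Translate dna from the 1-based position start until a stop codon or the end."""
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


CDS_START = 103  # assumed start codon position, read off the listing
n_terminus = translate(SEQ2_HEAD, CDS_START)

# 999 codons (SEQ ID NO:1) plus one stop codon give the last base of the coding region.
cds_end = CDS_START + 3 * (999 + 1) - 1

print(n_terminus, cds_end)
```

The translated residues correspond to the first six residues of SEQ ID NO:1 (Met Ala Asp Lys Val Arg), and the computed end of the coding region falls on the TAA codon that follows the C-terminal Leu codon in SEQ ID NO:2.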
The N-terminal amino acid sequence (SEQ ID NO:5) obtained with purified rat ORP150 corresponded to amino
acids 33-63 deduced from both the human and rat cDNAs, indicating that the first 32 residues represent the signal
peptide for secretion. The C-terminal KNDEL sequence, which resembles the KDEL sequence, a signal for retention of
ER-resident proteins [Pelham, H.R.B., Trends Biochem. Sci. 15, 483-486 (1990)], may function as an ER-retention signal. The
existence of a signal peptide at the N-terminus and the ER-retention signal-like sequence at the C-terminus suggests
that ORP150 resides in the ER, consistent with the results of immunocytochemical analysis reported by Kuwabara et
al. [Kuwabara, K., Matsumoto, M., Ikeda, J., Hori, O., Ogawa, S., Maeda, Y., Kitagawa, K., Imuta, N., Kinoshita, T.,
Stern, D.M., Yanagi, H., and Kamada, T., J. Biol. Chem. 271, 5025-5032 (1996)].
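The correspondence between SEQ ID NO:5 and residues 33-63 can be checked with a short sketch. The one-letter strings below were transcribed from SEQ ID NO:1 and SEQ ID NO:5 for this illustration; the variable names are hypothetical, and the "one residue removed" test is only a crude stand-in for "resembles KDEL".

```python
# Residues 1-64 and the last seven residues of SEQ ID NO:1, in one-letter code.
SEQ1_N_TERM = "MADKVRRQRPRRRVCW" "ALVAVLLADLLALSDT" "LAVMSVDLGSESMKVA" "IVKPGVPMEIVLNKES"
SEQ1_C_TERM = "PLKNDEL"

# N-terminal sequence determined from the purified rat protein (SEQ ID NO:5).
SEQ5 = "LAVMSVDLGSESMKVA" "IVKPGVPMEIVLNKE"

# 1-based positions of SEQ ID NO:5 within the deduced sequence.
first = SEQ1_N_TERM.find(SEQ5) + 1
last = first + len(SEQ5) - 1

# Everything before the mature N-terminus is taken as the signal peptide.
signal_peptide = SEQ1_N_TERM[:first - 1]

# Crude similarity test: does deleting one residue from the C-terminal
# pentapeptide give the canonical KDEL retention signal?
tail = SEQ1_C_TERM[-5:]
kdel_like = any(tail[:i] + tail[i + 1:] == "KDEL" for i in range(len(tail)))

print(first, last, len(signal_peptide), tail, kdel_like)
```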

Analysis of protein databases with the BLAST program [Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and
Lipman, D.J., J. Mol. Biol. 215, 403-410 (1990)] showed that the N-terminal half of ORP150 has a modest similarity to the
ATPase domain of numerous HSP70 family sequences. An extensive analysis with pairwise alignments [Pearson, W.R.,
and Lipman, D.J., Proc. Natl. Acad. Sci. USA 85, 2444-2448 (1988)] revealed that amino acids 33-426 of human
ORP150 were 32% identical to amino acids 1-380 of both inducible human HSP70.1 [Hunt, C., and Morimoto, R.I., Proc.
Natl. Acad. Sci. USA 82, 6455-6459 (1985)] and constitutive bovine HSC70 [DeLuca-Flaherty, C., and McKay, D.B.,
Nucleic Acids Res. 18, 5569 (1990)], typical members of the HSP70 family. An additional region similar to HSP70RY and
hamster HSP110, which both belong to a new subfamily of large HSP70-like proteins [Lee-Yoon, D., Easton, D.,
Murawski, M., Burd, R., and Subjeck, J.R., J. Biol. Chem. 270, 15725-15733 (1995)], extended further to residue 487. A
protein sequence motif search with PROSITE [Bairoch, A., and Bucher, P., Nucleic Acids Res. 22, 3583-3589 (1994)]
showed that ORP150 contains two of the three HSP70 protein family signatures: FYDMGSGSTVCTIV (amino acids
230-243, SEQ ID NO:1) and VILVGGATRVPRVQE (amino acids 380-394, SEQ ID NO:1), which completely matched
the HSP70 signatures 2 and 3, respectively. In addition, VDLG (amino acids 38-41, SEQ ID NO:1) matched the
first four amino acids of signature 1. Furthermore, the N-terminal region of ORP150 contained a putative ATP-binding
site consisting of the regions (amino acids 36-53, 197-214, 229-243, 378-400, and 411-425, SEQ ID NO:1)
corresponding to the five motifs specified by Bork et al. [Bork, P., Sander, C., and Valencia, A., Proc. Natl. Acad. Sci. USA
89, 7290-7294 (1992)]. Although the C-terminal putative peptide-binding domains of the HSP70 family are generally less
conserved [Rippmann, F., Taylor, W.R., Rothbard, J.B., and Green, N.M., EMBO J. 10, 1053-1059 (1991)], the
C-terminal region flanked by amino acids 701 and 898 (SEQ ID NO:1) shared appreciable similarity with HSP110 (amino acids
595-793; 29% identity).
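The residue numbers quoted for the three signature matches can be reproduced with a simple substring search. This is a minimal sketch: the fragments below were transcribed from SEQ ID NO:1 into one-letter code for this illustration, only the stretches around the motifs are included, and `FRAGMENTS` and `locate` are hypothetical names. It locates the motif strings only and does not evaluate the PROSITE patterns themselves.

```python
# Stretches of SEQ ID NO:1 in one-letter code, keyed by the residue number
# of their first residue.
FRAGMENTS = {
    33: "LAVMSVDLGSESMKVA",                       # residues 33-48
    225: "AQNIMFYDMGSGSTVC" "TIVTYQMVKTKEAGMQ",   # residues 225-256
    369: "SAEMSLDEIEQVILVG" "GATRVPRVQEVLLKAV",   # residues 369-400
}

MOTIFS = ["VDLG", "FYDMGSGSTVCTIV", "VILVGGATRVPRVQE"]


def locate(motif):
    """Return the (first, last) residue numbers of motif, or None if absent."""
    for start, fragment in FRAGMENTS.items():
        index = fragment.find(motif)
        if index >= 0:
            first = start + index
            return first, first + len(motif) - 1
    return None


positions = {motif: locate(motif) for motif in MOTIFS}
for motif, span in positions.items():
    print(motif, span)
```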

Example 7

Cloning of human ORP150 genomic DNA

A human genomic library purchased from Clontech (derived from human placenta, Cat. #HL1067J, Lot #1221,
2.5 x 10^6 independent clones) was used. A DNA fragment consisting of 202 bp of the 5' untranslated region and 369 bp of
the coding region derived from the rat cDNA clone, as well as a 1351 bp DNA fragment containing the termination
codon, derived from the human cDNA, were used as probes for plaque hybridization.

Escherichia coli LE392, previously infected with 1 x 10^6 pfu of the human genomic library, was plated onto 10 petri
dishes 15 cm in diameter to allow plaque formation. The phage DNA was transferred to a nylon membrane
(Hybond-N+, Amersham) and denatured with sodium hydroxide, after which it was fixed by ultraviolet irradiation. The rat cDNA
probe was labeled using a DNA labeling kit (Ready To Go, Pharmacia), and hybridized with the membrane in the
Rapid-hyb buffer (Amersham). After incubation at 65°C for 2 hours, the nylon membrane was washed with 0.2 x SSC-0.1%
SDS, and a positive clone was detected on an imaging plate (Fuji Photo Film). Since the clone isolated contained only
exons 1 through 24, 1.5 x 10^6 clones of the same library were screened again using the human cDNA probe in the same
manner, resulting in isolation of one clone. This clone was found to contain exons 16 through 26, with an overlap with
the 3' region of the above-mentioned clone. The entire region of the ORP150 gene was thus cloned by combining these
two clones.

These two clones were cleaved with BamHI and subcloned into pBluescript II SK (Stratagene), followed by
nucleotide sequence determination of the entire 15851 bp human ORP150 genomic DNA. The nucleotide sequence from the
5' end to just before the translation initiation codon ATG in exon 2 is shown by SEQ ID NO:12 in the sequence listing.

Furthermore, the nucleotide sequence of the 15851 bp human ORP150 genomic DNA was compared with that of
the human ORP150 cDNA shown by SEQ ID NO:2 in the sequence listing, which showed that the exons are located at
the positions shown below. A schematic diagram of the positions of the exons is shown in Figure 1.
Exon       Position in genomic DNA    (Base position in SEQ ID NO:2)

Exon 1      1908 -  2002              (   1 -   95)
Exon 2      2855 -  2952              (  96 -  193)
Exon 3      3179 -  3272              ( 194 -  287)
Exon 4      3451 -  3529              ( 288 -  366)
Exon 5      3683 -  3837              ( 367 -  521)
Exon 6      3962 -  4038              ( 522 -  598)
Exon 7      4347 -  4528              ( 599 -  780)
Exon 8      4786 -  4901              ( 781 -  896)
Exon 9      6193 -  6385              ( 897 - 1089)
Exon 10     6593 -  6727              (1090 - 1224)
Exon 11     6850 -  6932              (1225 - 1307)
Exon 12     7071 -  7203              (1308 - 1440)
Exon 13     7397 -  7584              (1441 - 1628)
Exon 14     7849 -  7987              (1629 - 1767)
Exon 15     9176 -  9236              (1768 - 1828)
Exon 16     9378 -  9457              (1829 - 1908)
Exon 17     9810 -  9995              (1909 - 2094)
Exon 18    10127 - 10299              (2095 - 2267)
Exon 19    10450 - 10537              (2268 - 2355)
Exon 20    10643 - 10765              (2356 - 2478)
Exon 21    10933 - 11066              (2479 - 2612)
Exon 22    11195 - 11279              (2613 - 2697)
Exon 23    12211 - 12451              (2698 - 2938)
Exon 24    12546 - 12596              (2939 - 2989)
Exon 25    13181 - 13231              (2990 - 3040)
Exon 26    13358 - 14823              (3041 - 4503)
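The exon table can be used to convert a base position in the cDNA (SEQ ID NO:2) into a position in the genomic DNA. The following is a minimal sketch under stated assumptions: the coordinates are copied from the table as printed, the start codon is taken to be at position 103 of SEQ ID NO:2 (read off the sequence listing, not stated in the text), and `EXONS` and `cdna_to_genomic` are hypothetical names. It also lists any exon whose two printed spans differ in length, without attempting to resolve the difference.

```python
# (exon, genomic_start, genomic_end, cdna_start, cdna_end), as printed in the table.
EXONS = [
    (1, 1908, 2002, 1, 95), (2, 2855, 2952, 96, 193),
    (3, 3179, 3272, 194, 287), (4, 3451, 3529, 288, 366),
    (5, 3683, 3837, 367, 521), (6, 3962, 4038, 522, 598),
    (7, 4347, 4528, 599, 780), (8, 4786, 4901, 781, 896),
    (9, 6193, 6385, 897, 1089), (10, 6593, 6727, 1090, 1224),
    (11, 6850, 6932, 1225, 1307), (12, 7071, 7203, 1308, 1440),
    (13, 7397, 7584, 1441, 1628), (14, 7849, 7987, 1629, 1767),
    (15, 9176, 9236, 1768, 1828), (16, 9378, 9457, 1829, 1908),
    (17, 9810, 9995, 1909, 2094), (18, 10127, 10299, 2095, 2267),
    (19, 10450, 10537, 2268, 2355), (20, 10643, 10765, 2356, 2478),
    (21, 10933, 11066, 2479, 2612), (22, 11195, 11279, 2613, 2697),
    (23, 12211, 12451, 2698, 2938), (24, 12546, 12596, 2939, 2989),
    (25, 13181, 13231, 2990, 3040), (26, 13358, 14823, 3041, 4503),
]


def cdna_to_genomic(position):
    """Map a 1-based cDNA position to the 1-based genomic position."""
    for _, g_start, _, c_start, c_end in EXONS:
        if c_start <= position <= c_end:
            return g_start + (position - c_start)
    raise ValueError(f"position {position} is outside the cDNA")


# Exons whose genomic span and cDNA span differ in length as printed.
length_mismatches = [
    exon for exon, g_start, g_end, c_start, c_end in EXONS
    if g_end - g_start != c_end - c_start
]

ATG_IN_CDNA = 103  # assumed start codon position in SEQ ID NO:2
atg_genomic = cdna_to_genomic(ATG_IN_CDNA)

print(atg_genomic, atg_genomic - 1, length_mismatches)
```

Under these assumptions the start codon maps into exon 2, and the number of genomic bases preceding it equals the 2861 bp length given for SEQ ID NO:12. Exons 1 through 25 have equal lengths in both columns; only the printed spans for exon 26 differ, by three bases.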



Example 8

Northern blot analysis

A 4.5-kb EcoRI fragment of human ORP150 cDNA was labeled with [α-32P]dCTP (3,000 Ci/mmol; Amersham
Corp., Arlington Heights, IL) by using a DNA labeling kit (Pharmacia), and used as a hybridization probe. Total RNA
(20 µg) prepared from U373 cells exposed to various stresses was electrophoresed and transferred onto a Hybond
membrane (Amersham Corp.). Multiple Tissue Northern Blots, in which each lane contained 2 µg of poly(A) RNA from
the adult human tissues indicated, were purchased from Clontech. The filters were hybridized at 65°C in the Rapid-hyb
buffer (Amersham Corp.) with human ORP150, GRP78, HSP70, glyceraldehyde-3-phosphate dehydrogenase
(G3PDH), and β-actin cDNAs, each labeled with [α-32P]dCTP, washed with 0.1 x SSC containing 0.1% SDS at 65°C,
and subjected to autoradiography.

As shown in Figure 2, the ORP150 mRNA level was markedly increased after 24 to 48 hours of exposure to hypoxia.
In parallel experiments, treatment with 2-deoxyglucose (25 mM, 24 hours) or tunicamycin (5 µg/ml, 24 hours) increased
ORP150 mRNA to levels comparable to that induced by hypoxia. The induction levels were also comparable with
those observed for the mRNA of a typical glucose-regulated protein, GRP78. Heat shock treatment failed to increase
ORP150 mRNA appreciably.

ORP150 mRNA was found to be highly expressed in the liver and pancreas, whereas little expression was
observed in kidney and brain (Figure 3). Furthermore, the tissue specificity of ORP150 expression was quite similar to
that of GRP78. The higher expression observed in the tissues that contain well-developed ER and synthesize large
amounts of secretory proteins is consistent with the finding that ORP150 is localized in the ER (Kuwabara, K.,
Matsumoto, M., Ikeda, J., Hori, O., Ogawa, S., Maeda, Y., Kitagawa, K., Imuta, N., Kinoshita, T., Stern, D.M., Yanagi, H., and
Kamada, T., J. Biol. Chem. 271, 5025-5032 (1996)).

In conclusion, both the characteristic primary protein structure and the similarity found with GRP78 in stress
inducibility and tissue specificity suggest that ORP150 plays an important role in protein folding and secretion in the ER,
perhaps as a molecular chaperone, in concert with other GRPs to cope with environmental stress.

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the present invention described specifically herein. Such equivalents are
intended to be encompassed in the scope of the following claims.



SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(iii) NUMBER OF SEQUENCES: 12 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ala Asp Lys Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp
5 10 15

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr
20 25 30

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
35 40 45

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser
50 55 60

Arg Arg Lys Thr Pro Val Ile Val Thr Leu Lys Glu Asn Glu Arg Phe
65 70 75 80

Phe Gly Asp Ser Ala Ala Ser Met Ala Ile Lys Asn Pro Lys Ala Thr
85 90 95

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His
100 105 110

Val Ala Leu Tyr Gln Ala Arg Phe Pro Glu His Glu Leu Thr Phe Asp
115 120 125

Pro Gln Arg Gln Thr Val His Phe Gln Ile Ser Ser Gln Leu Gln Phe
130 135 140

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu
145 150 155 160

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val
165 170 175

Pro Val Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala
180 185 190

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala
195 200 205

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Thr Thr
210 215 220

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys
225 230 235 240

Thr Ile Val Thr Tyr Gln Met Val Lys Thr Lys Glu Ala Gly Met Gln
245 250 255

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly
260 265 270

Leu Glu Met Glu Leu Arg Leu Arg Glu Arg Leu Ala Gly Leu Phe Asn
275 280 285

Glu Gln Arg Lys Gly Gln Arg Ala Lys Asp Val Arg Glu Asn Pro Arg
290 295 300

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu
305 310 315 320

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp
325 330 335

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys
340 345 350

Ala Asp Leu Phe Glu Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln
355 360 365

Ser Ala Glu Met Ser Leu Asp Glu Ile Glu Gln Val Ile Leu Val Gly
370 375 380

Gly Ala Thr Arg Val Pro Arg Val Gln Glu Val Leu Leu Lys Ala Val
385 390 395 400

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala
405 410 415

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val
420 425 430

Lys Pro Phe Val Val Arg Asp Ala Val Val Tyr Pro Ile Leu Val Glu
435 440 445

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Ile His Ser Leu Lys His
450 455 460

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys
465 470 475 480

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn
485 490 495

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly
500 505 510

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Asp Ser Phe
515 520 525

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn
530 535 540

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe
545 550 555 560

Glu Thr Leu Val Glu Asp Ser Ala Glu Glu Glu Ser Thr Leu Thr Lys
565 570 575

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Thr Pro Asp
580 585 590

Ala Lys Glu Asn Gly Thr Asp Thr Val Gln Glu Glu Glu Glu Ser Pro
595 600 605

Ala Glu Gly Ser Lys Asp Glu Pro Gly Glu Gln Val Glu Leu Lys Glu
610 615 620

Glu Ala Glu Ala Pro Val Glu Asp Gly Ser Gln Pro Pro Pro Pro Glu
625 630 635 640

Pro Lys Gly Asp Ala Thr Pro Glu Gly Glu Lys Ala Thr Glu Lys Glu
645 650 655

Asn Gly Asp Lys Ser Glu Ala Gln Lys Pro Ser Glu Lys Ala Glu Ala
660 665 670

Gly Pro Glu Gly Val Ala Pro Ala Pro Glu Gly Glu Lys Lys Gln Lys
675 680 685

Pro Ala Arg Lys Arg Arg Met Val Glu Glu Ile Gly Val Glu Leu Val
690 695 700

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Lys Leu Ala Gln Ser Val
705 710 715 720

Gln Lys Leu Gln Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg
725 730 735

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp
740 745 750

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg
755 760 765

Glu Glu Ile Ser Gly Lys Leu Ser Ala Ala Ser Thr Trp Leu Glu Asp
770 775 780

Glu Gly Val Gly Ala Thr Thr Val Met Leu Lys Glu Lys Leu Ala Glu
785 790 795 800

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Lys
805 810 815

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser
820 825 830

Ser Met Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile
835 840 845

Phe Thr Glu Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Glu Thr
850 855 860

Trp Ala Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala
865 870 875 880

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met
885 890 895

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr
900 905 910

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Ala Glu Pro
915 920 925

Pro Leu Asn Ala Ser Ala Ser Asp Gln Gly Glu Lys Val Ile Pro Pro
930 935 940

Ala Gly Gln Thr Glu Asp Ala Glu Pro Ile Ser Glu Pro Glu Lys Val
945 950 955 960

Glu Thr Gly Ser Glu Pro Gly Asp Thr Glu Pro Leu Glu Leu Gly Gly
965 970 975

Pro Gly Ala Glu Pro Glu Gln Lys Glu Gln Ser Thr Gly Gln Lys Arg
980 985 990

Pro Leu Lys Asn Asp Glu Leu
995






(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) IDENTIFICATION METHOD: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TTGTGAAGGG CGCGGGTGGG GGGCGCTGCC GGCCTCGTGG GTACGTTCGT GCCGCGTCTG 60
TCCCAGAGCT GGGGCCGCAG GAGCGGAGGC AAGAGGGGCA CTATGGCAGA CAAAGTTAGG 120
AGGCAGAGGC CGAGGAGGCG AGTCTGTTGG GCCTTGGTGG CTGTGCTCTT GGCAGACCTG 180
TTGGCACTGA GTGATACACT GGCAGTGATG TCTGTGGACC TGGGCAGTGA GTCCATGAAG 240
GTGGCCATTG TCAAACCTGG AGTGCCCATG GAAATTGTCT TGAATAAGGA ATCTCGGAGG 300
AAAACACCGG TGATCGTGAC CCTGAAAGAA AATGAAAGAT TCTTTGGAGA CAGTGCAGCA 360 
AGCATGGCGA TTAAGAATCC AAAGGCTACG CTACGTTACT TCCAGCACCT CCTGGGGAAG 420 
CAGGCAGATA ACCCCCATGT AGCTCTTTAC CAGGCCCGCT TCCCGGAGCA CGAGCTGACT 480 
TTCGACCCAC AGAGGCAGAC TGTGCACTTT CAGATCAGCT CGCAGCTGCA GTTCTCACCT 540 
GAGGAAGTGT TGGGCATGGT TCTCAATTAT TCTCGTTCTC TAGCTGAAGA TTTTGCAGAG 600 
CAGCCCATCA AGGATGCAGT GATCACCGTG CCAGTCTTCT TCAACCAGGC CGAGCGCCGA 660 
GCTGTGCTGC AGGCTGCTCG TATGGCTGGC CTCAAAGTGC TGCAGCTCAT CAATGACAAC 720 
ACCGCCACTG CCCTCAGCTA TGGTGTCTTC CGCCGGAAAG ATATTAACAC CACTGCCCAG 780 
AATATCATGT TCTATGACAT GGGCTChGGC AGCACCGTAT GCACCATTGT GACCTACCAG 840 
ATGGTGAAGA CTAAGGAAGC TGGGATGCAG CCACAGCTGC AGATCCGGGG AGTAGGATTT 900 
GACCGTACCC TGGGGGGCCT GGAGATGGAG CTCCGGCTTC GAGAACGCCT GGCTGGGCTT 960
TTCAATGAGC AGCGCAAGGG TCAGAGAGCA AAGGATGTGC GGGAGAACCC GCGTGCCATG 1020 
GCCAAGCTGC TGCGTGAGGC TAATCGGCTC AAAACCGTCC TCAGTGCCAA CGCTGACCAC 1080 
ATGGCACAGA TTGAAGGCCT GATGGATGAT GTGGACTTCA AGGCAAAAGT GACTCGTGTG 1140 
GAATTTGAGG AGTTGTGTGC AGACTTGTTT GAGCGGGTGC CTGGGCCTGT ACAGCAGGCC 1200 
CTCCAGAGTG CCGAAATGAG TCTGGATGAG ATTGAGCAGG TGATCCTGGT GGGTGGGGCC 1260 
ACTCGGGTCC CCAGAGTTCA GGAGGTGCTG CTGAAGGCCG TGGGCAAGGA GGAGCTGGGG 1320 
AAGAACATCA ATGCAGATGA AGCAGCCGCC ATGGGGGCAG TGTACCAGGC AGCTGCGCTC 1380 
AGCAAAGCCT TTAAAGTGAA GCCATTTGTC GTCCGAGATG CAGTGGTCTA CCCCATCCTG 1440 
GTGGAGTTCA CGAGGGAGGT GGAGGAGGAG CCTGGGATTC ACAGCCTGAA GCACAATAAA 1500 
CGGGTACTCT TCTCTCGGAT GGGGCCCTAC CCTCAACGCA AAGTCATCAC CTTTAACCGC 1560 
TACAGCCATG ATTTCAACTT CCACATCAAC TACGGCGACC TGGGCTTCCT GGGGCCTGAA 1620 
GATCTTCGGG TATTTGGCTC CCAGAATCTG ACCACAGTGA AGCTAAAAGG GGTGGGTGAC 1680 
AGCTTCAAGA AGTATCCTGA CTACGAGTCC AAGGGCATCA AGGCTCACTT CAACCTGGAT 1740 
GAGAGTGGCG TGCTCAGTCT AGACAGGGTG GAGTCTGTAT TTGAGACACT GGTAGAGGAC 1800 
AGCGCAGAAG AGGAATCTAC TCTCACCAAA CTTGGCAACA CCATTTCCAG CCTGTTTGGA 1860 
GGCGGTACCA CACCAGATGC CAAGGAGAAT GGTACTGATA CTGTCCAGGA GGAAGAGGAG 1920 
AGCCCTGCAG AGGGGAGCAA GGACGAGCCT GGGGAGCAGG TGGAGCTCAA GGAGGAAGCT 1980
GAGGCCCCAG TGGAGGATGG CTCTCAGCCC CCACCCCCTG AACCTAAGGG AGATGCAACC 2040 
CCTGAGGGAG AAAAGGCCAC AGAAAAAGAA AATGGGGACA AGTCTGAGGC CCAGAAACCA 2100 
AGTGAGAAGG CAGAGGCAGG GCCTGAGGGC GTCGCTCCAG CCCCAGAGGG AGAGAAGAAG 2160 
CAGAAGCCCG CCAGGAAGCG GCGAATGGTA GAGGAGATCG GGGTGGAGCT GGTTGTTCTG 2220 



GACCTGCCTG ACTTGCCAGA GGATAAGCTG GCTCAGTCGG TGCAGAAACT TCAGGACTTG 2280 
ACACTCCGAG ACCTGGAGAA GCAGGAACGG GAAAAAGCTG CCAACAGCTT GGAAGCGTTC 2340 
ATATTTGAGA CCCAGGACAA GCTGTACCAG CCCGAGTACC AGGAAGTGTC CACAGAGGAG 2400 



CAGCGTGAGG AGATCTCTGG GAAGCTCAGC GCCGCATCCA CCTGGCTGGA GGATGAGGGT 2460 
GTTGGAGCCA CCACAGTGAT GTTGAAGGAG AAGCTGGCTG AGCTGAGGAA GCTGTGCCAA 2520 
GGGCTGTTTT TTCGGGTAGA GGAGCGCAAG AAGTGGCCCG AACGGCTGTC TGCCCTCGAT 2580
AATCTCCTCA ACCATTCCAG CATGTTCCTC AAGGGGGCCC GGCTCATCCC AGAGATGGAC 2640 
CAGATCTTCA CTGAGGTGGA GATGACAACG TTAGAGAAAG TCATCAATGA GACCTGGGCC 2700 
TGGAAGAATG CAACTCTGGC CGAGCAGGCT AAGCTGCCCG CCACAGAGAA GCCTGTGTTG 2760 
CTCTCAAAAG ACATTGAAGC TAAGATGATG GCCCTGGACC GAGAGGTGCA GTATCTGCTC 2820 
AATAAGGCCA AGTTTACCAA GCCCCGGCCC CGGCCTAAGG ACAAGAATGG GACCCGGGCA 2880 
GAGCCACCCC TCAATGCCAG TGCCAGTGAC CAGGGGGAGA AGGTCATCCC TCCAGCAGGC 2940 
CAGACTGAAG ATGCAGAGCC CATTTCAGAA CCTGAGAAAG TAGAGACTGG ATCCGAGCCA 3000 
GGAGACACTG AGCCTTTGGA GTTAGGAGGT CCTGGAGCAG AACCTGAACA GAAAGAACAA 3060 



TCGACAGGAC AGAAGCGGCC TTTGAAGAAC GACGAACTAT AACCCCCACC TCTGTTTTCC 3120 
CCATTCATCT CCACCCCCTT CCCCCACCAC TTCTATTTAT TTAACATCGA GGGTTGGGGG 3180 
AGGGGTTGGT CCTGCCCTCG GCTGGAGTTC CTTTCTCACC CCTGTGATTT GGAGGTGTGG 3240 
AGAAGGGGAA GGGAGGGACA GCTCACTGGT TCCTTCTGCA GTACCTCTGT GGTTAAAAAT 3300 
GGAAACTGTT CTCCTCCCCA GCCCCACTCC CTGTTCCCTA CCCATATAGG CCCTAAATTT 3360
GGGAAAAATC ACTATTAATT TCTGAATCCT TTGCCTGTGG GTAGGAAGAG AATGGCTGCC 3420 
AGTGGCTGAT GGGTCCCGGT GATGGGAAGG GTATCAGGTT GCTGGGGAGT TTCCACTCTT 3480 
CTCTGGTGAT TGTTCCTTCC CTCCCTTCCT CTCCCACCAT GCGATGAGCA TCCTTTCAGG 3540 
CCAGTGTCTG CAGAGCCTCA GTTACCAGGT TTGGTTTCTG AGTGCC7ATC TGTGCTCTTT 3600
CCTCCCTCTG CGGGCTTCTC TTGCTCTGAG CCTCCCTTCC CCATTCCCAT GCAGCTCCTT 3660
TCCCCCTGGG TTTCCTTGGC TTCCTGCAGC AAATTGGGCA GTTCTCTGCC CCTTGCCTAA 3720 
AAGCCTGTAC CTCTGGATTG GCGGAAGTAA ATCTGGAAGG ATTCTCACTC GTATTTCCCA 3780 
CCCCTAGTGG CCAGAGGAGG GAGGGGCACA GTGAAGAAGG GAGCCCACCA CCTCTCCGAA 3840 
GAGGAAAGCC ACGTAGAGTG GTTGGCATGG GGTGCCAGCA TCGTGCAAGC TCTGTCATAA 3900 
TCTGCATCTT CCCAGCAGCC TGGTACCCCA GGTTCCTGTA ACTCCCTGCC TCCTCCTCTC 3960 
TTCTGCTGTT CTGCTCCTCC CAGACAGAGC CTTTCCCTCA CCCCCTGACC CCCTGGGCTG 4020 
ACCAAAATGT GCTTTCTACT GTGAGTCCCT ATCCCAAGAT CCTGGGGAAA GGAGAGACCA 4080 
TGGTGTGAAT GTAGAGATGC CACCTCCCTC TCTCTGAGGC AGGCCTGTGG ATGAAGGAGG 4140 
AGGGTCAGGG CTGGCCTTCC TCTGTGCATC ACTCTGCTAG GTTGGGGGCC CCCGACCCAC 4200 
CATACCTACG CCTAGGGAGC CCGTCCTCCA GTATTCCGTC TGTAGCAGGA GCTAGGGCTG 4260 
CTGCCTCAGC TCCAAGACAA GAATGAACCT GGCTGTTGCA GTCATTTTGT CTTTTCCTTT 4320 
TTTTTTTTTT GCCACATTGG CAGAGATGGG ACCTAAGGGT CCCACCCCTC ACCCCACCCC 4380 
CACCTCTTCT GTATGTTTGA ATTCTTTCAG TAGCTGTTGA TGCTGGTTGG ACAGGTTTGA 4440 
GTCAAATTGT ACTTTGCTCC ATTGTTAATT GAGAAACTGT TTCAATAAAA TATTCTTTTC 4500 
TAC 4503 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Ala Ala Thr Val Arg Arg Gln Arg Pro Arg Arg Leu Leu Cys Trp
5 10 15

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr
20 25 30

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
35 40 45

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser
50 55 60

Arg Arg Lys Thr Pro Val Thr Val Thr Leu Lys Glu Asn Glu Arg Phe
65 70 75 80

Leu Gly Asp Ser Ala Ala Gly Met Ala Ile Lys Asn Pro Lys Ala Thr
85 90 95

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His
100 105 110

Val Ala Leu Tyr Arg Ser Arg Phe Pro Glu His Glu Leu Asn Val Asp
115 120 125

Pro Gln Arg Gln Thr Val Arg Phe Gln Ile Ser Pro Gln Leu Gln Phe
130 135 140

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu
145 150 155 160

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val
165 170 175

Pro Ala Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala
180 185 190

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala
195 200 205

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Ser Thr
210 215 220

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys
225 230 235 240

Thr Ile Val Thr Tyr Gln Thr Val Lys Thr Lys Glu Ala Gly Thr Gln
245 250 255

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly
260 265 270

Leu Glu Met Glu Leu Arg Leu Arg Glu His Leu Ala Lys Leu Phe Asn
275 280 285

Glu Gln Arg Lys Gly Gln Lys Ala Lys Asp Val Arg Glu Asn Pro Arg
290 295 300

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu
305 310 315 320

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp
325 330 335

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys
340 345 350

Ala Asp Leu Phe Asp Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln
355 360 365

Ser Ala Glu Met Ser Leu Asp Gln Ile Glu Gln Val Ile Leu Val Gly
370 375 380

Gly Pro Thr Arg Val Pro Lys Val Gln Glu Val Leu Leu Lys Pro Val
385 390 395 400

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala
405 410 415

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val
420 425 430

Lys Pro Phe Val Val Arg Asp Ala Val Ile Tyr Pro Ile Leu Val Glu
435 440 445

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Leu Arg Ser Leu Lys His
450 455 460

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys
465 470 475 480

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn
485 490 495

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly
500 505 510

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Glu Ser Phe
515 520 525

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn
530 535 540

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe
545 550 555 560

Glu Thr Leu Val Glu Asp Ser Pro Glu Glu Glu Ser Thr Leu Thr Lys
565 570 575

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Ser Ser Asp
580 585 590

Ala Lys Glu Asn Gly Thr Asp Ala Val Gln Glu Glu Glu Glu Ser Pro
595 600 605

Ala Glu Gly Ser Lys Asp Glu Pro Ala Glu Gln Gly Glu Leu Lys Glu
610 615 620

Glu Ala Glu Ala Pro Met Glu Asp Thr Ser Gln Pro Pro Pro Ser Glu
625 630 635 640

Pro Lys Gly Asp Ala Ala Arg Glu Gly Glu Thr Pro Asp Glu Lys Glu
645 650 655

Ser Gly Asp Lys Ser Glu Ala Gln Lys Pro Asn Glu Lys Gly Gln Ala
660 665 670

Gly Pro Glu Gly Val Pro Pro Ala Pro Glu Glu Glu Lys Lys Gln Lys
675 680 685

Pro Ala Arg Lys Gln Lys Met Val Glu Glu Ile Gly Val Glu Leu Ala
690 695 700

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Glu Leu Ala His Ser Val
705 710 715 720

Gln Lys Leu Glu Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg
725 730 735

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp
740 745 750

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg
755 760 765

Glu Glu Ile Ser Gly Lys Leu Ser Ala Thr Ser Thr Trp Leu Glu Asp
770 775 780

Glu Gly Phe Gly Ala Thr Thr Val Met Leu Lys Asp Lys Leu Ala Glu
785 790 795 800

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Arg
805 810 815

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser
820 825 830

Ser Ile Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile
835 840 845

Phe Thr Asp Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Asp Thr
850 855 860

Trp Thr Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala
865 870 875 880

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met
885 890 895

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr
900 905 910

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Thr Glu Pro
915 920 925

Pro Leu Asn Ala Ser Ala Gly Asp Gln Glu Glu Lys Val Ile Pro Pro
930 935 940

Thr Gly Gln Thr Glu Glu Ala Lys Ala Ile Leu Glu Pro Asp Lys Glu
945 950 955 960

Gly Leu Gly Thr Glu Ala Ala Asp Ser Glu Pro Leu Glu Leu Gly Gly
965 970 975

Pro Gly Ala Glu Ser Glu Gln Ala Glu Gln Thr Ala Gly Gln Lys Arg
980 985 990

Pro Leu Lys Asn Asp Glu Leu
995


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) IDENTIFICATION METHOD: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
TGAGGATGGA GCAGCGGTCG GGCCGCGGCT CCTAGGGGAG GCAGCGTGCT AGCTTCGGGG 60
GCGGGCCAGT AGCGGGAGCG AGGGCCGTAC GGACACCGGT CCCTTCGGCC TTGAAGTTCA 120
GGCGCTGAGC TGCCCCCTCG CGCTCGGGGT GGGCCGGAAT CCATTTCTGG GAGTGGGATC 180
TTCCACCTTC ATCAGGGTCA CAATGGCAGC TACAGTAAGG AGGCAGAGGC CAAGGAGGCT 240
ACTCTGTTGG GCCTTGGTGG CTGTCCTCTT GGCAGACCTG TTGGCACTGA GTGACACACT 300
GGCTGTGATG TCTGTGGACC TGGGCAGTGA ATCCATGAAG GTGGCCATTG TCAAGCCTGG 360
AGTGCCCATG GAGATTGTAT TGAACAAGGA ATCTCGGAGG AAAACTCCGG TGACTGTGAC 420
CTTGAAGGAA AACGAAAGGT TTCTAGGTGA CAGTGCAGCT GGCATGGCCA TCAAGAACCC 480
AAAGGCTACG CTCCGTTATT TCCAGCACCT CCTTGGAAAG CAGGCAGATA ACCCTCATGT 540
GGCTCTTTAC CGGTCCCGTT TCCCAGAACA TGAGCTCAAT GTTGACCCAC AGAGGCAGAC 600
TGTGCGCTTC CAGATCAGTC CGCAGCTGCA GTTCTCTCCC GAGGAGGTGC TGGGCATGGT 660
TCTCAACTAC TCCCGTTCCC TGGCTGAAGA TTTTGCAGAA CAACCTATTA AGGATGCAGT 720
GATCACCGTG CCAGCCTTTT TCAACCAGGC CGAGCGCCGA GCTGTGCTGC AGGCTGCTCG 780
TATGGCTGGC CTCAAGGTGC TGCAGCTCAT CAATGACAAC ACTGCCACAG CCCTCAGCTA 840
TGGTGTCTTC CGCCGGAAAG ATATCAATTC CACTGCACAG AATATCATGT TCTATGACAT 900
GGGCTCGGGC AGCACTGTGT GTACCATCGT GACCTACCAA ACGGTGAAGA CTAAGGAGGC 960
TGGGACGCAG CCACAGCTAC AGATCCGGGG CGTGGGATTT GACCGCACCC TGGGTGGCCT 1020
GGAGATGGAG CTTCGGCTGC GAGAGCACCT GGCTAAGCTC TTCAATGAGC AGCGCAAGGG 1080
CCAGAAAGCC AAGGATGTTC GGGAAAACCC CCGAGCCATG GCCAAACTGC TTCGGGAAGC 1140
CAATCGGCTT AAAACCGTCC TGAGTGCCAA TGCTGATCAC ATGGCACAGA TTGAAGGCTT 1200 
GATGGACGAT GTGGACTTCA AGGCAAAAGT AACTCGAGTG GAGTTTGAGG AGCTGTGTGC 1260 
AGATTTGTTT GATCGAGTGC CTGGGCCTGT ACAGCAGGCC CTGCAGAGTG CTGAGATGAG 1320 
CCTGGATCAA ATTGAGCAGG TGATCCTGGT GGGTGGGCCC ACTCGTGTTC CCAAAGTTCA 1380 
AGAGGTGCTG CTGAAGCCTG TGGGCAAGGA GGAACTAGGA AAGAACATCA ATGCCGATGA 1440 
AGCAGCTGCC ATGGGGGCCG TGTACCAGGC AGCGGCACTG AGCAAAGCCT TCAAAGTGAA 1500 
GCCATTTGTT GTGCGTGATG CTGTTATTTA CCCCATCCTG GTGGAGTTCA CAAGGGAGGT 1560 
GGAGGAGGAG CCTGGGCTTC GAAGCCTGAA GCACAATAAA CGTGTGCTCT TCTCCCGAAT 1620 
GGGGCCCTAC CCTCAGCGCA AAGTCATCAC CTTTAACCGA TACAGCCATG ATTTCAACTT 1680 
TCACATCAAC TACGGTGACC TGGGCTTCCT GGGGCCTGAG GATCTTCGGG TATTTGGCTC 1740 
CCAGAATCTG ACCACAGTGA AACTAAAAGG TGTGGGAGAG AGCTTCAAGA AATATCCTGA 1800 
CTATGAGTCC AAAGGCATCA AGGCCCACTT TAACCTAGAC GAGAGTGGAG TGCTCAGTTT 1860 
AGACAGGGTG GAGTCCGTAT TCGAGACCCT GGTGGAGGAC AGCCCAGAGG AAGAGTCTAC 1920 
TCTTACCAAA CTTGGCAACA CCATTTCCAG CCTGTTTGGC GGTGGTACCT CATCAGATGC 1980 
CAAAGAGAAT GGTACTGATG CTGTACAGGA GGAGGAGGAG AGCCCTGCTG AGGGGAGCAA 2040 
GGATGAGCCT GCAGAACAGG GGGAACTCAA GGAGGAAGCT GAAGCCCCAA TGGAGGATAC 2100 
CTCCCAGCCT CCACCCTCTG AGCCTAAGGG GGATGCAGCC CGTGAGGGAG AAACACCTGA 2160 
TGAAAAAGAA AGTGGGGACA AGTCTGAGGC CCAGAAGCCC AATGAGAAGG GGCAGGCAGG 2220 
GCCTGAGGGT GTCCCTCCAG CTCCCGAGGA AGAAAAAAAG CAGAAACCTG CCCGGAAGCA 2280 
GAAAATGGTG GAGGAGATAG GTGTGGAACT GGCTGTCTTG GACCTGCCAG ACTTGCCAGA 2340 
GGATGAGCTG GCCCATTCCG TGCAGAAACT TGAGGACTTG ACCCTGCGAG ACCTTGAAAA 2400 
GCAGGAGAGG GAGAAAGCTG CCAACAGCTT AGAAGCTTTT ATCTTTGAGA CCCAGGACAA 2460 
ACTGTACCAA CCTGAGTACC AGGAAGTGTC CACTGAGGAA CAACGGGAGG AGATCTCTGG 2520 
AAAACTCAGT GCCACTTCTA CCTGGCTGGA GGATGAGGGA TTTGGAGCCA CCACTGTGAT 2580
GTTGAAGGAC AAGCTGGCTG AGCTGAGAAA GCTGTGCCAA GGGCTGTTTT TTCGGGTGGA 2640
AGAGCGCAGG AAATGGCCAG AGCGGCTTTC AGCTCTGGAT AATCTCCTCA ATCACTCCAG 2700 
CATTTTCCTC AAGGGTGCCC GACTCATCCC AGAGATGGAC CAGATCTTCA CTGACGTGGA 2760 
GATGACAACG TTGGAGAAAG TCATCAATGA CACCTGGACC TGGAAGAATG CAACCCTGGC 2820
CGAGCAGGCC AAGCTTCCTG CCACAGAGAA ACCCGTGCTG CTTTCAAAAG ACATCGAGGC 2880 
CAAAATGATG GCCCTGGACC GGGAGGTGCA GTATCTACTC AATAAGGCCA AGTTTACTAA 2940 
ACCCCGGCCA CGGCCCAAGG ACAAGAATGG CACCCGGACA GAGCCTCCCC TCAATGCCAG 3000 
TGCTGGTGAC CAAGAGGAAA AGGTCATTCC ACCTACAGGC CAGACTGAAG AGGCGAAGGC 3060 
CATCTTAGAA CCTGACAAAG AAGGGCTTGG TACAGAGGCA GCAGACTCTG AGCCTCTGGA 3120 
ATTAGGAGGT CCTGGTGCAG AATCTGAACA GGCAGAGCAG ACAGCAGGGC AGAAGCGGCC 3180 
TTTGAAGAAT GATGAGCTGT GACCCCGCGC CTCCGCTCCA CTTGCCTCCA GCCCCTTCTC 3240 
CTACCACCTC TA 3252 



(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala 
3C 5 10 15 

lie Val Lys Pro Gly Val Pro Met Glu lie Val Leu Asn Lys Glu 
20 25 30 



(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic 

acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 
AATACGACTC ACTATAGGGA 20 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 
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(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

Lys Pro Gly Val Pro Met Glu 
5 

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

( D ) TOPOLOGY : 1 inear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic 

acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
AARCCiGGiG TNCCNATGGA 20 
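
SEQ ID NO: 8 reads as a degenerate oligonucleotide covering the peptide of SEQ ID NO: 7. The following Python sketch is one way to check that reading; it is not taken from the patent. It assumes that R and N follow the IUPAC nucleotide code and that the lowercase i denotes inosine, treated here as pairing with any base. The helper names `expand` and `encodes` are illustrative.

```python
from itertools import product

# IUPAC codes that occur in SEQ ID NO: 8. Treating "i" (inosine) as
# "any base" is an assumption made for this check only.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "N": "ACGT", "i": "ACGT"}

# Partial standard codon table: only the residues of SEQ ID NO: 7.
CODONS = {
    "Lys": {"AAA", "AAG"},
    "Pro": {"CCA", "CCC", "CCG", "CCT"},
    "Gly": {"GGA", "GGC", "GGG", "GGT"},
    "Val": {"GTA", "GTC", "GTG", "GTT"},
    "Met": {"ATG"},
    "Glu": {"GAA", "GAG"},
}

PRIMER = "AARCCiGGiGTNCCNATGGA"                               # SEQ ID NO: 8
PEPTIDE = ["Lys", "Pro", "Gly", "Val", "Pro", "Met", "Glu"]   # SEQ ID NO: 7


def expand(seq):
    """Yield every unambiguous sequence covered by a degenerate one."""
    for bases in product(*(IUPAC[symbol] for symbol in seq)):
        yield "".join(bases)


def encodes(seq, peptide):
    """True if seq, read from its first base, is compatible with peptide.

    A trailing partial codon only has to be a prefix of a valid codon.
    """
    for i, residue in enumerate(peptide):
        chunk = seq[3 * i:3 * i + 3]
        if not any(codon.startswith(chunk) for codon in CODONS[residue]):
            return False
    return True


variants = list(expand(PRIMER))
all_match = all(encodes(v, PEPTIDE) for v in variants)
print(len(variants), all_match)
```

Under these assumptions every expansion encodes Lys Pro Gly Val Pro Met followed by the first two bases of a Glu codon, which accounts for the 20-base length.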

(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GCACCCTTGA GGAAAATGCT 20 
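
The relationship between SEQ ID NO: 10 and the rat cDNA of SEQ ID NO: 4 is not stated in this listing, but it can be checked directly. The Python sketch below takes the reverse complement of SEQ ID NO: 10 and searches for it in positions 2641 to 2760 of SEQ ID NO: 4 as printed in this listing. The function names and the choice of fragment are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_10 = "GCACCCTTGAGGAAAATGCT"  # SEQ ID NO: 10

# Positions 2641-2760 of SEQ ID NO: 4, copied from the listing.
FRAGMENT_START = 2641
FRAGMENT = (
    "AGAGCGCAGGAAATGGCCAGAGCGGCTTTCAGCTCTGGATAATCTCCTCAATCACTCCAG"
    "CATTTTCCTCAAGGGTGCCCGACTCATCCCAGAGATGGACCAGATCTTCACTGACGTGGA"
)


def locate(primer, fragment, fragment_start):
    """Return 1-based (first, last) positions of the antisense match, or None."""
    target = reverse_complement(primer)
    offset = fragment.find(target)
    if offset < 0:
        return None
    first = fragment_start + offset
    return first, first + len(target) - 1


site = locate(PRIMER_10, FRAGMENT, FRAGMENT_START)
print(reverse_complement(PRIMER_10), site)
```

On this reading SEQ ID NO: 10 appears to be an antisense oligonucleotide complementary to positions 2699 to 2718 of SEQ ID NO: 4.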
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCAGAAGCC CAATGAGAAG 20 
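
The listing does not say which part of the rat sequence SEQ ID NO: 11 corresponds to. As a rough check, the Python sketch below translates the oligonucleotide in its three forward reading frames with the standard genetic code and looks for each translation in residues 657 to 672 of SEQ ID NO: 3. The one-letter fragment and all names are assumptions introduced for this illustration.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

PRIMER_11 = "CCCAGAAGCCCAATGAGAAG"  # SEQ ID NO: 11

# Residues 657-672 of SEQ ID NO: 3 in one-letter code (assumed transcription).
RAT_FRAGMENT_START = 657
RAT_FRAGMENT = "SGDKSEAQKPNEKGQA"


def translate(seq, frame):
    """Translate complete codons of seq starting at 0-based offset frame."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


frames = [translate(PRIMER_11, f) for f in range(3)]

hit = None
for frame, peptide in enumerate(frames):
    offset = RAT_FRAGMENT.find(peptide)
    if offset >= 0:
        hit = (frame, RAT_FRAGMENT_START + offset)
        break

print(frames, hit)
```

Only the third frame matches, giving Gln Lys Pro Asn Glu Lys, which suggests that SEQ ID NO: 11 is a sense-strand oligonucleotide covering residues 664 to 669 of SEQ ID NO: 3.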



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2861 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GAAAGAAGTA GACATGGGAG ACTTCATTTT GTTCTGTACT AAGAAAAATT CTTCTGCCTT 60
GGGATGCTGT TGATCTATGA CCTTACCCCC AACCCTGTGC TCTCTGAAAC ATGTGCTGTG 120 
TCCACTCAGG GTTAAATGGA TTAAGGGCGG TGCAAGATGT GCTTTGTTAA ACAGATGCTT 180 
GAAGGCAGCA TGCTCGTTAG GAGTCATCAC CACTCCCTAA TCTCAAGTAC CCAGGGACAC 240 
AAACACTGCG GAAGGCCACA GGGTCCTCTG CCTAGGAAAG CCAGAGACCT TTGTTCACTT 300 
GTTTATCTGC TGACCTTCCC TCCACTATTG TCCTATGACC CTGCCAAATC CCCCTCTGCC 360 
AGAAACACCC AAGAATGATC AATAAAAAAA AAAAAAAAAA AAAAAGGAAG AATAGACTCT 420 
CTCTGGGACT GCCAATAATT TTTCCTTCTA AGCATAGACA CCGGACCACT CTCCACCTAA 480
GCATCACGAA AAATGTAGAG AAAGGAAGAG CTAAGAGCTC CTTAAACAAG TTCAGGCTTG 540
ACACAACCCT GGCCCTGACA GCCAGGGTCT TCAAGCGGGC CTTTCTGTGA AGGGTGGCCA 600
GGCATCAACT TAGTAGGAGA GAAAACAGAT GACTTATTTC CATCCACACT TAACGAAAAT 660
GCAGTCTCCA AGGACTGCGT ACATTTCTTT TTCGAGAAGG AGTCTCGCTG TTGTCGCCCA 720 
GGCTGGAGTG CAGTGGCGCA GTCTGGGCTC ACAGCAACCT CTGCCTCCCG GATTCAAGCA 780 
ATTCTCCTGC CTCAGCCTCG TGAGTAGCTG GGATTACAGG CACCCGCCAC CACGCCTGGC 840
TAATTTTTGT AGTTTTGGTA GAGACGGGGT TTCACCATGT TGGCCAGGCT GGTCTCGAAC 900
TCCTGACCTC CAGTGATTCG CCCGCCTTGG CCTCCCAAAA TGCTGGGATT ACAGGCGTGA 960 
GCCACCGCGC CCGGGCGACT GCGCACATTT CTATGGAGCT GTAAGTTAAA AGAGAAGGCA 1020 
GTGAGGTGCT TCTGTCATTC TATGACAGAA ACAGCTAAAG AGTAGAGAAA TGTTCACAAG 1080 
ATTTAATAGA ACAGAAATAG GAGAAGGTGC ACACAAGCTC AACCAACTAT AGCCTCACAA 1140 
ATAAAAGTGT CTTTTGTGTG TAGTACTTAA GTTTGGAATA TTCTTTCTTA TACAAATGAG 1200 
TGGGGCTTAA CCTAAGAAAT CCTGGCCAGA TTCTGCGACG AATGCATCGG TTATCTCTGA 1260 
CCCATCAGCA AACATCTTTT TCTGTGGCTT CAGTTTCCTC AGTAAAACAG AGGGGGTTGC 1320 
GACGGACTCA GTCCGAGGCA CAGCCATTCT CCAACGTCTA TCCAAAGCCT AGGGCACCTC 1380 
AATACTAACC GGCAGGCCAG CGCCCCCTCC GCGGGGCTGC GGACAGGACG CCTGTTATTC 1440 
CATTCCTCGG CCGGGCTCTA CAGGTGACCG GAAGAAGAGC CCCGAGTGCG GGACTGCAGT 1500 
GCGCCCGACC TGCTCTAGGC GCAGGTCACT CCCGAACCCC GGCAGCAAAG CATCCAGCGC 1560 
CGGAAAAGGT CCCGCGGTCG CCCCGGGGCC GGCGCTGGGG AGGAAGGAGT GGAGCGCGCT 1620 
GGCCCCGTGA CGTGGTCCAA TCCCAGGCCG ACGCCGGCTG CTTCTGCCCA ACCGGTGGCT 1680 
GGTCCCCTCC GCCGCCCCCA TTACAAGGCT GGCAAAGGGA GGGGGCGGGG CCTGGGACGT 1740 
GGTCCAATGA GTACGCGCGC CGGGGCGGCG GGGGCGGGGC CGGGCGCGCA GCGCAGGGCC 1800 
GGGCGGCCGA GGCTCCAATG AGCGCCCGCC GCGTCCGGGG CCGGCTGGTG CGCGAGACGC 1860 
CGCCGAGAGG TTGGTGGCTA ATGTAACAGT TTGCAAACCG AGAGGAGTTG TGAAGGGCGC 1920 
GGGTGGGGGG CGCTGCCGGC CTCGTGGGTA CGTTCGTGCC GCGTCTGTCC CAGAGCTGGG 1980
GCCGCAGGAG CGGAGGCAAG AGGTAGCGGG GGTGGATGGA GGTGCGGGCC GGCCACCCCT 2040 
CCTAGGGGAG ACAGCGTGCG AGCTCCGGGG GCGGGTCGGG AGCGCAAGGG AGGGCCGCGC 2100 
GGACGCCGGG CGCTCGGCCT CGCACCGGGG GGCACGCAGC TCGGCCCCCG GTCTGTCCCC 2160 
ACTTGCTGGG GCGGGCCGGG ATCCGTTTCC GGGAGTGGGA GCCGCCGCCT TCGTCAGGTG 2220 
GGGTTTAGGT GAACACCGGG TAACGGCTAC CCGCCGGGCG GGGAACCTTA CCGCCCCTGG 2280 
CACTGCGTCT GTGGGCACAG CGGGGCCGGG GAGTGAGCTG GGAAAGGGGA GGGGGCGGGA 2340 
CAACCCGCAG GGATGCCGAG GAGGAGATAG GCCTTTCCTT CATCCTAGCT ACCCCCAACG 2400 
TCATTACCTT TCTCTTCCCG TCCAGGCCCA GCTGGCTTTC CCCGTCAGCG GGGGAGCTCC 2460
AGGTGTGGGG AGGTGGTTGA GCCCTGGGCG GGGATCCCTG GCCGCACCCC AGGTGTCTGA 2520
CAACAGGCAC AGTGCTGCGG TGCGCCACTC ACTGCCTGTG TGGTGGACAA AAGGCTCGGG 2580
TCTCCTTTCT CTTGTCCTGT TAGCTTCTCT GTTTAGGGAT GTGGCAAAGC CGAGGACCCA 2640 
TGCTCTTTCA CTTGGGCCTT TGTGTGGGCG CTGCTGGGAT GATTAGAGAA TGGTTTGTAC 2700 
CCATCAGGAG GGAGAAGGGG AGAAGTAGGC TGATCTGCCC TGGGTAAGAA TGAAGTAGAT 2760 
ATGAATCTTA CAGCCTCTCC GTTCTGGGAT GTGATTCTGT CTCCTTCACT CCGGGTATCC 2820 
AGTTTTAAGT GTTTTCTTTC TTCGCCTCCC CCAGGGGCAC T 2861

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: HSP Research Institute, Inc.

(B) STREET: 2-8, Doshomachi 2-chome, Chuo-ku,

(C) CITY: Osaka-shi, Osaka 

(E) COUNTRY: JP 

(F) POSTAL CODE (ZIP) : none 

(ii) TITLE OF INVENTION : STRESS PROTEINS 
(iii) NUMBER OF SEQUENCES: 12 

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA: 

APPLICATION NUMBER: EP 96 12 0622.0 
(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 7-349661 

(B) FILING DATE: 20-DEC-1995 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 8-213181 

(B) FILING DATE: 23-JUL-1996 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 999 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ala Asp Lys Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp
1 5 10 15

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr
20 25 30

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
35 40 45

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser
50 55 60

Arg Arg Lys Thr Pro Val Ile Val Thr Leu Lys Glu Asn Glu Arg Phe
65 70 75 80

Phe Gly Asp Ser Ala Ala Ser Met Ala Ile Lys Asn Pro Lys Ala Thr
85 90 95

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His
100 105 110

Val Ala Leu Tyr Gln Ala Arg Phe Pro Glu His Glu Leu Thr Phe Asp
115 120 125

Pro Gln Arg Gln Thr Val His Phe Gln Ile Ser Ser Gln Leu Gln Phe
130 135 140

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu
145 150 155 160

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val
165 170 175

Pro Val Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala
180 185 190

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala
195 200 205

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Thr Thr
210 215 220

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys
225 230 235 240

Thr Ile Val Thr Tyr Gln Met Val Lys Thr Lys Glu Ala Gly Met Gln
245 250 255

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly
260 265 270

Leu Glu Met Glu Leu Arg Leu Arg Glu Arg Leu Ala Gly Leu Phe Asn
275 280 285

Glu Gln Arg Lys Gly Gln Arg Ala Lys Asp Val Arg Glu Asn Pro Arg
290 295 300

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu
305 310 315 320

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp
325 330 335

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys
340 345 350

Ala Asp Leu Phe Glu Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln
355 360 365

Ser Ala Glu Met Ser Leu Asp Glu Ile Glu Gln Val Ile Leu Val Gly
370 375 380

Gly Ala Thr Arg Val Pro Arg Val Gln Glu Val Leu Leu Lys Ala Val
385 390 395 400

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala
405 410 415

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val
420 425 430

Lys Pro Phe Val Val Arg Asp Ala Val Val Tyr Pro Ile Leu Val Glu
435 440 445

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Ile His Ser Leu Lys His
450 455 460

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys
465 470 475 480

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn
485 490 495

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly
500 505 510

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Asp Ser Phe
515 520 525

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn
530 535 540

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe
545 550 555 560

Glu Thr Leu Val Glu Asp Ser Ala Glu Glu Glu Ser Thr Leu Thr Lys
565 570 575

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Thr Pro Asp
580 585 590

Ala Lys Glu Asn Gly Thr Asp Thr Val Gln Glu Glu Glu Glu Ser Pro
595 600 605

Ala Glu Gly Ser Lys Asp Glu Pro Gly Glu Gln Val Glu Leu Lys Glu
610 615 620

Glu Ala Glu Ala Pro Val Glu Asp Gly Ser Gln Pro Pro Pro Pro Glu
625 630 635 640

Pro Lys Gly Asp Ala Thr Pro Glu Gly Glu Lys Ala Thr Glu Lys Glu
645 650 655

Asn Gly Asp Lys Ser Glu Ala Gln Lys Pro Ser Glu Lys Ala Glu Ala
660 665 670

Gly Pro Glu Gly Val Ala Pro Ala Pro Glu Gly Glu Lys Lys Gln Lys
675 680 685

Pro Ala Arg Lys Arg Arg Met Val Glu Glu Ile Gly Val Glu Leu Val
690 695 700

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Lys Leu Ala Gln Ser Val
705 710 715 720

Gln Lys Leu Gln Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg
725 730 735

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp
740 745 750

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg
755 760 765

Glu Glu Ile Ser Gly Lys Leu Ser Ala Ala Ser Thr Trp Leu Glu Asp
770 775 780

Glu Gly Val Gly Ala Thr Thr Val Met Leu Lys Glu Lys Leu Ala Glu
785 790 795 800

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Lys
805 810 815

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser
820 825 830

Ser Met Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile
835 840 845

Phe Thr Glu Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Glu Thr
850 855 860

Trp Ala Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala
865 870 875 880

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met
885 890 895

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr
900 905 910

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Ala Glu Pro
915 920 925

Pro Leu Asn Ala Ser Ala Ser Asp Gln Gly Glu Lys Val Ile Pro Pro
930 935 940

Ala Gly Gln Thr Glu Asp Ala Glu Pro Ile Ser Glu Pro Glu Lys Val
945 950 955 960

Glu Thr Gly Ser Glu Pro Gly Asp Thr Glu Pro Leu Glu Leu Gly Gly
965 970 975

Pro Gly Ala Glu Pro Glu Gln Lys Glu Gln Ser Thr Gly Gln Lys Arg
980 985 990

Pro Leu Lys Asn Asp Glu Leu
995
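
For database searches or alignments it can be convenient to have the three-letter listing in one-letter code. The Python sketch below shows one way to convert it, using the first and last printed rows of SEQ ID NO: 1 as examples. The mapping is the standard amino acid nomenclature; the function name is illustrative and not part of the listing.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(text):
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError on any token that is not a standard residue code.
    """
    return "".join(THREE_TO_ONE[token] for token in text.split())


# First and last rows of SEQ ID NO: 1 (residues 1-16 and 993-999).
N_TERM = to_one_letter(
    "Met Ala Asp Lys Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp"
)
C_TERM = to_one_letter("Pro Leu Lys Asn Asp Glu Leu")
print(N_TERM, C_TERM)
```

Position numbers printed under the residues must be removed before conversion, since the function rejects any token that is not a residue code.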

(2) INFORMATION FOR SEQ ID NO : 2: 






(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 4503 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA






(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 103..3099



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TTGTGAAGGG CGCGGGTGGG GGGCGCTGCC GGCCTCGTGG GTACGTTCGT GCCGCGTCTG 60

TCCCAGAGCT GGGGCCGCAG GAGCGGAGGC AAGAGGGGCA CT ATG GCA GAC AAA 114
Met Ala Asp Lys
1

GTT AGG AGG CAG AGG CCG AGG AGG CGA GTC TGT TGG GCC TTG GTG GCT 162
Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp Ala Leu Val Ala
5 10 15 20

GTG CTC TTG GCA GAC CTG TTG GCA CTG AGT GAT ACA CTG GCA GTG ATG 210
Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr Leu Ala Val Met
25 30 35

TCT GTG GAC CTG GGC AGT GAG TCC ATG AAG GTG GCC ATT GTC AAA CCT 258
Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala Ile Val Lys Pro
40 45 50

GGA GTG CCC ATG GAA ATT GTC TTG AAT AAG GAA TCT CGG AGG AAA ACA 306
Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser Arg Arg Lys Thr
55 60 65

CCG GTG ATC GTG ACC CTG AAA GAA AAT GAA AGA TTC TTT GGA GAC AGT 354
Pro Val Ile Val Thr Leu Lys Glu Asn Glu Arg Phe Phe Gly Asp Ser
70 75 80

GCA GCA AGC ATG GCG ATT AAG AAT CCA AAG GCT ACG CTA CGT TAC TTC 402
Ala Ala Ser Met Ala Ile Lys Asn Pro Lys Ala Thr Leu Arg Tyr Phe
85 90 95 100

CAG CAC CTC CTG GGG AAG CAG GCA GAT AAC CCC CAT GTA GCT CTT TAC 450
Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His Val Ala Leu Tyr
105 110 115

CAG GCC CGC TTC CCG GAG CAC GAG CTG ACT TTC GAC CCA CAG AGG CAG 498
Gln Ala Arg Phe Pro Glu His Glu Leu Thr Phe Asp Pro Gln Arg Gln
120 125 130

ACT GTG CAC TTT CAG ATC AGC TCG CAG CTG CAG TTC TCA CCT GAG GAA 546
Thr Val His Phe Gln Ile Ser Ser Gln Leu Gln Phe Ser Pro Glu Glu
135 140 145

GTG TTG GGC ATG GTT CTC AAT TAT TCT CGT TCT CTA GCT GAA GAT TTT 594
Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu Ala Glu Asp Phe
150 155 160

GCA GAG CAG CCC ATC AAG GAT GCA GTG ATC ACC GTG CCA GTC TTC TTC 642
Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val Pro Val Phe Phe
165 170 175 180

AAC CAG GCC GAG CGC CGA GCT GTG CTG CAG GCT GCT CGT ATG GCT GGC 690
Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala Arg Met Ala Gly
185 190 195

CTC AAA GTG CTG CAG CTC ATC AAT GAC AAC ACC GCC ACT GCC CTC AGC 738
Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala Thr Ala Leu Ser
200 205 210

TAT GGT GTC TTC CGC CGG AAA GAT ATT AAC ACC ACT GCC CAG AAT ATC 786
Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Thr Thr Ala Gln Asn Ile
215 220 225

ATG TTC TAT GAC ATG GGC TCA GGC AGC ACC GTA TGC ACC ATT GTG ACC 834
Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys Thr Ile Val Thr
230 235 240

TAC CAG ATG GTG AAG ACT AAG GAA GCT GGG ATG CAG CCA CAG CTG CAG 882
Tyr Gln Met Val Lys Thr Lys Glu Ala Gly Met Gln Pro Gln Leu Gln
245 250 255 260

ATC CGG GGA GTA GGA TTT GAC CGT ACC CTG GGG GGC CTG GAG ATG GAG 930
Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly Leu Glu Met Glu
265 270 275

CTC CGG CTT CGA GAA CGC CTG GCT GGG CTT TTC AAT GAG CAG CGC AAG 978
Leu Arg Leu Arg Glu Arg Leu Ala Gly Leu Phe Asn Glu Gln Arg Lys
280 285 290

GGT CAG AGA GCA AAG GAT GTG CGG GAG AAC CCG CGT GCC ATG GCC AAG 1026
Gly Gln Arg Ala Lys Asp Val Arg Glu Asn Pro Arg Ala Met Ala Lys
295 300 305

CTG CTG CGT GAG GCT AAT CGG CTC AAA ACC GTC CTC AGT GCC AAC GCT 1074
Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu Ser Ala Asn Ala
310 315 320

GAC CAC ATG GCA CAG ATT GAA GGC CTG ATG GAT GAT GTG GAC TTC AAG 1122
Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp Val Asp Phe Lys
325 330 335 340

GCA AAA GTG ACT CGT GTG GAA TTT GAG GAG TTG TGT GCA GAC TTG TTT 1170
Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys Ala Asp Leu Phe
345 350 355

GAG CGG GTG CCT GGG CCT GTA CAG CAG GCC CTC CAG AGT GCC GAA ATG 1218
Glu Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln Ser Ala Glu Met
360 365 370

AGT CTG GAT GAG ATT GAG CAG GTG ATC CTG GTG GGT GGG GCC ACT CGG 1266
Ser Leu Asp Glu Ile Glu Gln Val Ile Leu Val Gly Gly Ala Thr Arg
375 380 385

GTC CCC AGA GTT CAG GAG GTG CTG CTG AAG GCC GTG GGC AAG GAG GAG 1314
Val Pro Arg Val Gln Glu Val Leu Leu Lys Ala Val Gly Lys Glu Glu
390 395 400

CTG GGG AAG AAC ATC AAT GCA GAT GAA GCA GCC GCC ATG GGG GCA GTG 1362
Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala Met Gly Ala Val
405 410 415 420

TAC CAG GCA GCT GCG CTC AGC AAA GCC TTT AAA GTG AAG CCA Wf GTC 1410
Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val Lys Pro Phe Val
425 430 435

GTC CGA GAT GCA GTG GTC TAC CCC ATC CTG GTG GAG TTC ACG AGG GAG 1458
Val Arg Asp Ala Val Val Tyr Pro Ile Leu Val Glu Phe Thr Arg Glu
440 445 450

GTG GAG GAG GAG CCT GGG ATT CAC AGC CTG AAG CAC AAT AAA CGG GTA 1506
Val Glu Glu Glu Pro Gly Ile His Ser Leu Lys His Asn Lys Arg Val
455 460 465

CTC TTC TCT CGG ATG GGG CCC TAC CCT CAA CGC AAA GTC ATC ACC TTT 1554
Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys Val Ile Thr Phe
470 475 480

AAC CGC TAC AGC CAT GAT TTC AAC TTC CAC ATC AAC TAC GGC GAC CTG 1602
Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn Tyr Gly Asp Leu
485 490 495 500

GGC TTC CTG GGG CCT GAA GAT CTT CGG GTA TTT GGC TCC CAG AAT CTG 1650
Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly Ser Gln Asn Leu
505 510 515

ACC ACA GTG AAG CTA AAA GGG GTG GGT GAC AGC TTC AAG AAG TAT CCT 1698
Thr Thr Val Lys Leu Lys Gly Val Gly Asp Ser Phe Lys Lys Tyr Pro
520 525 530

GAC TAC GAG TCC AAG GGC ATC AAG GCT CAC TTC AAC CTG GAT GAG AGT 1746
Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn Leu Asp Glu Ser
535 540 545

GGC GTG CTC AGT CTA GAC AGG GTG GAG TCT GTA TTT GAG ACA CTG GTA 1794
Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe Glu Thr Leu Val
550 555 560

GAG GAC AGC GCA GAA GAG GAA TCT ACT CTC ACC AAA CTT GGC AAC ACC 1842
Glu Asp Ser Ala Glu Glu Glu Ser Thr Leu Thr Lys Leu Gly Asn Thr
565 570 575 580

ATT TCC AGC CTG TTT GGA GGC GGT ACC ACA CCA GAT GCC AAG GAG AAT 1890
Ile Ser Ser Leu Phe Gly Gly Gly Thr Thr Pro Asp Ala Lys Glu Asn
585 590 595

GGT ACT GAT ACT GTC CAG GAG GAA GAG GAG AGC CCT GCA GAG GGG AGC 1938
Gly Thr Asp Thr Val Gln Glu Glu Glu Glu Ser Pro Ala Glu Gly Ser
600 605 610

AAG GAC GAG CCT GGG GAG CAG GTG GAG CTC AAG GAG GAA GCT GAG GCC 1986
Lys Asp Glu Pro Gly Glu Gln Val Glu Leu Lys Glu Glu Ala Glu Ala
615 620 625

CCA GTG GAG GAT GGC TCT CAG CCC CCA CCC CCT GAA CCT AAG GGA GAT 2034
Pro Val Glu Asp Gly Ser Gln Pro Pro Pro Pro Glu Pro Lys Gly Asp
630 635 640

GCA ACC CCT GAG GGA GAA AAG GCC ACA GAA AAA GAA AAT GGG GAC AAG 2082
Ala Thr Pro Glu Gly Glu Lys Ala Thr Glu Lys Glu Asn Gly Asp Lys
645 650 655 660

TCT GAG GCC CAG AAA CCA AGT GAG AAG GCA GAG GC« GGG CC1 GAG GGC 2130
Ser Glu Ala Gln Lys Pro Ser Glu Lys Ala Glu Ala Gly Pro Glu Gly
665 670 675

GTC GCT CCA GCC CCA GAG GGA GAG AAG AAG CAG AAG CCC GCC AGG AAG 2178
Val Ala Pro Ala Pro Glu Gly Glu Lys Lys Gln Lys Pro Ala Arg Lys
680 685 690

CGG CGA ATG GTA GAG GAG ATC GGG GTG GAG CTG GTT GTT CTG GAC CTG 2226
Arg Arg Met Val Glu Glu Ile Gly Val Glu Leu Val Val Leu Asp Leu
695 700 705

CCT GAC TTG CCA GAG GAT AAG CTG GCT CAG TCG GTG CAG AAA CTT CAG 2274
Pro Asp Leu Pro Glu Asp Lys Leu Ala Gln Ser Val Gln Lys Leu Gln
710 715 720

GAC TTG ACA CTC CGA GAC CTG GAG AAG CAG GAA CGG GAA AAA GCT GCC 2322
Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg Glu Lys Ala Ala
725 730 735 740

AAC AGC TTG GAA GCG TTC ATA TTT GAG ACC CAG GAC AAG CTG TAC CAG 2370
Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp Lys Leu Tyr Gln
745 750 755

CCC GAG TAC CAG GAA GTG TCC ACA GAG GAG CAG CGT GAG GAG ATC TCT 2418
Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg Glu Glu Ile Ser
760 765 770



GGG AAG CTC AGC GCC GCA TCC ACC TGG CTG GAG GAT GAG GGT GTT GGA 2466
Gly Lys Leu Ser Ala Ala Ser Thr Trp Leu Glu Asp Glu Gly Val Gly
775 780 785

GCC ACC ACA GTG ATG TTG AAG GAG AAG CTG GCT GAG CTG AGG AAG CTG 2514
Ala Thr Thr Val Met Leu Lys Glu Lys Leu Ala Glu Leu Arg Lys Leu
790 795 800

TGC CAA GGG CTG TTT TTT CGG GTA GAG GAG CGC AAG AAG TGG CCC GAA 2562
Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Lys Lys Trp Pro Glu
805 810 815 820

CGG CTG TCT GCC CTC GAT AAT CTC CTC AAC CAT TCC AGC ATG TTC CTC 2610
Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser Ser Met Phe Leu
825 830 835

AAG GGG GCC CGG CTC ATC CCA GAG ATG GAC CAG ATC TTC ACT GAG GTG 2658
Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile Phe Thr Glu Val
840 845 850

GAG ATG ACA ACG TTA GAG AAA GTC ATC AAT GAG ACC TGG GCC TGG AAG 2706
Glu Met Thr Thr Leu Glu Lys Val Ile Asn Glu Thr Trp Ala Trp Lys
855 860 865

AAT GCA ACT CTG GCC GAG CAG GCT AAG CTG CCC GCC ACA GAG AAG CCT 2754
Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala Thr Glu Lys Pro
870 875 880

GTG TTG CTC TCA AAA GAC ATT GAA GCT AAG ATG ATG GCC CTG GAC CGA 2802
Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met Ala Leu Asp Arg
885 890 895 900

GAG GTG CAG TAT CTG CTC AAT AAG GCC AAG TTT ACC AAG CCC CGG CCC 2850
Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr Lys Pro Arg Pro
905 910 915

CGG CCT AAG GAC AAG AAT GGG ACC CGG GCA GAG CCA CCC CTC AAT GCC 2898
Arg Pro Lys Asp Lys Asn Gly Thr Arg Ala Glu Pro Pro Leu Asn Ala
920 925 930

AGT GCC AGT GAC CAG GGG GAG AAG GTC ATC CCT CCA GCA GGC CAG ACT 2946
Ser Ala Ser Asp Gln Gly Glu Lys Val Ile Pro Pro Ala Gly Gln Thr
935 940 945

GAA GAT GCA GAG CCC ATT TCA GAA CCT GAG AAA GTA GAG ACT GGA TCC 2994
Glu Asp Ala Glu Pro Ile Ser Glu Pro Glu Lys Val Glu Thr Gly Ser
950 955 960

GAG CCA GGA GAC ACT GAG CCT TTG GAG TTA GGA GGT CCT GGA GCA GAA 3042
Glu Pro Gly Asp Thr Glu Pro Leu Glu Leu Gly Gly Pro Gly Ala Glu
965 970 975 980

CCT GAA CAG AAA GAA CAA TCG ACA GGA CAG AAG CGG CCT TTG AAG AAC 3090
Pro Glu Gln Lys Glu Gln Ser Thr Gly Gln Lys Arg Pro Leu Lys Asn
985 990 995

GAC GAA CTA TAACCCCCAC CTCTGTTTTC CCCATTCATC TCCACCCCCT 3139
Asp Glu Leu

TCCCCCACCA CTTCTATTTA TTTAACATCG AGGGTTGGGG GAGGGGTTGG TCCTGCCCTC 3199
GGCTGGAGTT CCTTTCTCAC CCCTGTGATT TGGAGGTGTG GAGAAGGGGA AGGGAGGGAC 3259
AGCTCACTGG TTCCTTCTGC AGTACCTCTG TGGTTAAAAA TGGAAACTGT TCTCCTCCCC 3319
AGCCCCACTC CCTGTTCCCT ACCCATATAG GCCCTAAATT TGGGAAAAAT CACTATTAAT 3379
TTCTGAATCC TTTGCCTGTG GGTAGGAAGA GAATGGCTGC CAGTGGCTGA TGGGTCCCGG 3439
TGATGGGAAG GGTATCAGGT TGCTGGGGAG TTTCCACTCT TCTCTGGTGA TTGTTCCTTC 3499
CCTCCCTTCC TCTCCCACCA TGCGATGAGC ATCCTTTCAG GCCAGTGTCT GCAGAGCCTC 3559
AGTTACCAGG TTTGGTTTCT GAGTGCCTAT CTGTGCTCTT TCCTCCCTCT GCGGGCTTCT 3619
CTTGCTCTGA GCCTCCCTTC CCCATTCCCA TGCAGCTCCT TTCCCCCTGG GTTTCCTTGG 3679
CTTCCTGCAG CAAATTGGGC AGTTCTCTGC CCCTTGCCTA AAAGCCTGTA CCTCTGGATT 3739
GGCGGAAGTA AATCTGGAAG GATTCTCACT CGTATTTCCC ACCCCTAGTG GCCAGAGGAG 3799
GGAGGGGCAC AGTGAAGAAG GGAGCCCACC ACCTCTCCGA AGAGGAAAGC CACGTAGAGT 3859
GGTTGGCATG GGGTGCCAGC ATCGTGCAAG CTCTGTCATA ATCTGCATCT TCCCAGCAGC 3919
CTGGTACCCC AGGTTCCTGT AACTCCCTGC CTCCTCCTCT CTTCTGCTGT TCTGCTCCTC 3979

CCAGACAGAG CCTTTCCCTC ACCCCCTGAC CCCCTGGGCT GACCAAAATG TGCTTTCTAC 4039
TGTGAGTCCC TATCCCAAGA TCCTGGGGAA AGGAGAGACC ATGGTGTGAA TGTAGAGATG 4099
CCACCTCCCT CTCTCT3A3G CAGGCCTGTG GATGAAGGAG GAGG3TCAGG GCTGGCCTTC 4159
C7CTGTGCA7 CACTCTGCTA GGTTGGGGGC CCCCGACCCA CCATACCTAC GCCTAGGGAG 4219
CCCGTCCTCC AGTATTCCGT CTGTAGCAGG AGCTAGGGCT GCTGCCTCAG CTCCAAGACA 4279
AGAATGAACC TGGCTGTTGC AGTCATTTTG TCTTTTCCTT TTTTTTTTTT TGCCACATTG 4339
GCAGAGATGG GACCTAAGGG TCCCACCCCT CACCCCACCC CCACCTCTTC TGTATGTTTG 4399
AATTCTTTCA GTAGCTGTTG ATGCTGGTTG GACAGGTTTG AGTCAAATTG TACTTTGCTC 4459
CATTGTTAAT TGAGAAACTG TTTCAATAAA ATATTCTTTT CTAC 4503

(2) INFORMATION FOR SEQ ID NO : 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 999 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3: 

Met Ala Ala Thr Val Arg Arg Gln Arg Pro Arg Arg Leu Leu Cys Trp
1 5 10 15

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr
20 25 30

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
35 40 45

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser
50 55 60

Arg Arg Lys Thr Pro Val Thr Val Thr Leu Lys Glu Asn Glu Arg Phe
65 70 75 80

Leu Gly Asp Ser Ala Ala Gly Met Ala Ile Lys Asn Pro Lys Ala Thr
85 90 95

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His
100 105 110

Val Ala Leu Tyr Arg Ser Arg Phe Pro Glu His Glu Leu Asn Val Asp
115 120 125

Pro Gln Arg Gln Thr Val Arg Phe Gln Ile Ser Pro Gln Leu Gln Phe
130 135 140

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu
145 150 155 160

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val
165 170 175

Pro Ala Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala
180 185 190

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala
195 200 205

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Ser Thr
210 215 220

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys
225 230 235 240

Thr Ile Val Thr Tyr Gln Thr Val Lys Thr Lys Glu Ala Gly Thr Gln
245 250 255

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly
260 265 270

Leu Glu Met Glu Leu Arg Leu Arg Glu His Leu Ala Lys Leu Phe Asn
275 280 285

Glu Gln Arg Lys Gly Gln Lys Ala Lys Asp Val Arg Glu Asn Pro Arg
290 295 300

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu
305 310 315 320

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp
325 330 335

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys
340 345 350

Ala Asp Leu Phe Asp Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln
355 360 365

Ser Ala Glu Met Ser Leu Asp Gln Ile Glu Gln Val Ile Leu Val Gly
370 375 380

Gly Pro Thr Arg Val Pro Lys Val Gln Glu Val Leu Leu Lys Pro Val
385 390 395 400

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala
405 410 415

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val
420 425 430

Lys Pro Phe Val Val Arg Asp Ala Val Ile Tyr Pro Ile Leu Val Glu
435 440 445

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Leu Arg Ser Leu Lys His
450 455 460

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys
465 470 475 480

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn
485 490 495

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly
500 505 510

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Glu Ser Phe
515 520 525

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn
530 535 540

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe
545 550 555 560

Glu Thr Leu Val Glu Asp Ser Pro Glu Glu Glu Ser Thr Leu Thr Lys
565 570 575

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Ser Ser Asp
580 585 590

Ala Lys Glu Asn Gly Thr Asp Ala Val Gln Glu Glu Glu Glu Ser Pro
595 600 605

Ala Glu Gly Ser Lys Asp Glu Pro Ala Glu Gln Gly Glu Leu Lys Glu
610 615 620

Glu Ala Glu Ala Pro Met Glu Asp Thr Ser Gln Pro Pro Pro Ser Glu
625 630 635 640

Pro Lys Gly Asp Ala Ala Arg Glu Gly Glu Thr Pro Asp Glu Lys Glu
645 650 655

Ser Gly Asp Lys Ser Glu Ala Gln Lys Pro Asn Glu Lys Gly Gln Ala
660 665 670

Gly Pro Glu Gly Val Pro Pro Ala Pro Glu Glu Glu Lys Lys Gln Lys
675 680 685

Pro Ala Arg Lys Gln Lys Met Val Glu Glu Ile Gly Val Glu Leu Ala
690 695 700

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Glu Leu Ala His Ser Val
705 710 715 720

Gln Lys Leu Glu Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg
725 730 735

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp
740 745 750

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg
755 760 765

Glu Glu Ile Ser Gly Lys Leu Ser Ala Thr Ser Thr Trp Leu Glu Asp
770 775 780

Glu Gly Phe Gly Ala Thr Thr Val Met Leu Lys Asp Lys Leu Ala Glu
785 790 795 800

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Arg
805 810 815

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser
820 825 830

Ser Ile Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile
835 840 845

Phe Thr Asp Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Asp Thr
850 855 860

Trp Thr Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala
865 870 875 880

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met
885 890 895

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr
900 905 910

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Thr Glu Pro
915 920 925

Pro Leu Asn Ala Ser Ala Gly Asp Gln Glu Glu Lys Val Ile Pro Pro
930 935 940

Thr Gly Gln Thr Glu Glu Ala Lys Ala Ile Leu Glu Pro Asp Lys Glu
945 950 955 960

Gly Leu Gly Thr Glu Ala Ala Asp Ser Glu Pro Leu Glu Leu Gly Gly
965 970 975

Pro Gly Ala Glu Ser Glu Gln Ala Glu Gln Thr Ala Gly Gln Lys Arg
980 985 990

Pro Leu Lys Asn Asp Glu Leu
995

(2) INFORMATION FOR SEQ ID NO : 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3252 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 203..3199

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TGAGGATGGA GCAGCGGTCG GGCCGCGGCT CCTAGGGGAG GCAGCGTGCT AGCTTCGGGG 60
GCGGGCCAGT AGCGGGAGCG AGGGCCGTAC GGACACCGGT CCCTTCGGCC TTGAAGTTCA 120
GGCGCTGAGC TGCCCCCTCG CGCTCGGGGT GGGCCGGAAT CCATTTCTGG GAGTGGGATC 180

TTCCACCTTC ATCAGGGTCA CA ATG GCA GCT ACA GTA AGG AGG CAG AGO CC~
Met Ala Ala Thr Val Arg Arg Gln Arg Pro
1 5 10

AGG AGG CTA CTC TGT TGG GCC TTG GTG GCT GTC CTC TTG GCA GAC CTG
Arg Arg Leu Leu Cys Trp Ala Leu Val Ala Val Leu Leu Ala Asp Leu
15 20 25

TTG GCA CTG AGT GAC ACA CTG GCT GTG ATG TCT GTG GAC CTG GGC AGT
Leu Ala Leu Ser Asp Thr Leu Ala Val Met Ser Val Asp Leu Gly Ser
30 35 40

GAA TCC ATG AAG GTG GCC ATT GTC AAG CCT GGA GTG CCC ATG GAG ATT
Glu Ser Met Lys Val Ala Ile Val Lys Pro Gly Val Pro Met Glu Ile
45 50 55

GTA TTG AAC AAG GAA TCT CGG AGG AAA ACT CCG GTG ACT GTG ACC TTG
Val Leu Asn Lys Glu Ser Arg Arg Lys Thr Pro Val Thr Val Thr Leu
60 65 70

AAG GAA AAC GAA AGG TTT CTA GGT GAC AGT GCA GCT GGC ATG GCC ATC
Lys Glu Asn Glu Arg Phe Leu Gly Asp Ser Ala Ala Gly Met Ala Ile
75 80 85 90

AAG AAC CCA AAG GCT ACG CTC CGT TAT TTC CAG CAC CTC CTT GGA AAG
Lys Asn Pro Lys Ala Thr Leu Arg Tyr Phe Gln His Leu Leu Gly Lys
95 100 105

CAG GCA GAT AAC CCT CAT GTG GCT CTT TAC CGG TCC CGT TTC CCA GAA
Gln Ala Asp Asn Pro His Val Ala Leu Tyr Arg Ser Arg Phe Pro Glu
110 115 120

CAT GAG CTC AAT GTT GAC CCA CAG AGG CAG ACT GTG CGC TTC CAG ATC
His Glu Leu Asn Val Asp Pro Gln Arg Gln Thr Val Arg Phe Gln Ile
125 130 135

AGT CCG CAG CTG CAG TTC TCT CCC GAG GAG GTG CTG GGC ATG GTT CTC
Ser Pro Gln Leu Gln Phe Ser Pro Glu Glu Val Leu Gly Met Val Leu
140 145 150

AAC TAC TCC CGT TCC CTG GCT GAA GAT TTT GCA GAA CAA CCT ATT AAG
Asn Tyr Ser Arg Ser Leu Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys
155 160 165 170

GAT GCA GTG ATC ACC GTG CCA GCC TTT TTC AAC CAG GCC GAG CGC CGA
Asp Ala Val Ile Thr Val Pro Ala Phe Phe Asn Gln Ala Glu Arg Arg
175 180 185

GCT GTG CTG CAG GCT GCT CGT ATG GCT GGC CTC AAG GTG CTG CAG CTC
Ala Val Leu Gln Ala Ala Arg Met Ala Gly Leu Lys Val Leu Gln Leu
190 195 200

ATC AAT GAC AAC ACT GCC ACA GCC CTC AGC TAT GGT GTC TTC CGC CGG
Ile Asn Asp Asn Thr Ala Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg
205 210 215

AAA GAT ATC AAT TCC ACT GCA CAG AAT ATC ATG TTC TAT GAC ATG GGC
Lys Asp Ile Asn Ser Thr Ala Gln Asn Ile Met Phe Tyr Asp Met Gly
220 225 230

TCG GGC AGC ACT GTG TGT ACC ATC GTG ACC TAC CAA ACG GTG AAG AC! 952
Ser Gly Ser Thr Val Cys Thr Ile Val Thr Tyr Gln Thr Val Lys Thr
235 240 245 250

AAG GAG GCT GGG ACG CAG CCA CAG CTA CAG ATC CGG GGC GTG GGA TTT 1000
Lys Glu Ala Gly Thr Gln Pro Gln Leu Gln Ile Arg Gly Val Gly Phe
255 260 265

GAC CGC ACC CTG GGT GGC CTG GAG ATG GAG CTT CGG CTG CGA GAG CAC 1048
Asp Arg Thr Leu Gly Gly Leu Glu Met Glu Leu Arg Leu Arg Glu His
270 275 280

CTG GCT AAG CTC TTC AAT GAG CAG CGC AAG GGC CAG AAA GCC AAG GAT 10S6 
Leu Ala Lys Leu Phe Asn Glu Gin Arg Lys Gly Gin Lys Ala Lys Asp 
285 290 295 
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GTT CGG GAA AAC CCC CGA GCC ATG GCC AAA CTG CTT CGG GAA GCC AAT 114 4 

Val Arg Glu Asn Pro Arg Ala Met Ala Lys Leu Leu Arg Glu Ala Asn 
300 305 310 

CGG CTT AAA ACC GTC CTG AGT GCC AAT GCT GAT CAC ATG GGA CAG ATT 1192 
Arg Leu Lys Thr Val Leu Ser Ala Asn Ala Asp His Met Ala Gln Ile 
315 320 325 330 

GAA GGC TTG ATG GAC GAT GTG GAC TTC AAG GCA AAA GTA ACT CGA GTG 1240 
Glu Gly Leu Met Asp Asp Val Asp Phe Lys Ala Lys Val Thr Arg Val 
335 340 345 

GAG TTT GAG GAG CTG TGT GCA GAT TTG TTT GAT CGA GTG CCT GGG CCT 1288 
Glu Phe Glu Glu Leu Cys Ala Asp Leu Phe Asp Arg Val Pro Gly Pro 
350 355 360 

GTA CAG CAG GCC CTG CAG AGT GCT GAG ATG AGC CTG GAT CAA ATT GAG 1336 
Val Gln Gln Ala Leu Gln Ser Ala Glu Met Ser Leu Asp Gln Ile Glu 
365 370 375 

CAG GTG ATC CTG GTG GGT GGG CCC ACT CGT GTT CCC AAA GTT CAA GAG 1384 
Gln Val Ile Leu Val Gly Gly Pro Thr Arg Val Pro Lys Val Gln Glu 
380 385 390 

GTG CTG CTG AAG CCT GTG GGC AAG GAG GAA CTA GGA AAG AAC ATC AAT 1432 
Val Leu Leu Lys Pro Val Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn 
395 400 405 410 

GCC GAT GAA GCA GCT GCC ATG GGG GCC GTG TAC CAG GCA GCG GCA CTG 1480 
Ala Asp Glu Ala Ala Ala Met Gly Ala Val Tyr Gln Ala Ala Ala Leu 
415 420 425 

AGC AAA GCC TTC AAA GTG AAG CCA TTT GTT GTG CGT GAT GCT GTT ATT 1528 
Ser Lys Ala Phe Lys Val Lys Pro Phe Val Val Arg Asp Ala Val Ile 
430 435 440 

TAC CCC ATC CTG GTG GAG TTC ACA AGG GAG GTG GAG GAG GAG CCT GGG 1576 
Tyr Pro Ile Leu Val Glu Phe Thr Arg Glu Val Glu Glu Glu Pro Gly 
445 450 455 

CTT CGA AGC CTG AAG CAC AAT AAA CGT GTG CTC TTC TCC CGA ATG GGG 1624 
Leu Arg Ser Leu Lys His Asn Lys Arg Val Leu Phe Ser Arg Met Gly 
460 465 470 

CCC TAC CCT CA3 CGC AAA GTC ATC ACC TTT AAC CGA TAC AGC CAT GAY 
Pro Tyr Pro Gln Arg Lys Val Ile Thr Phe Asn Arg Tyr Ser His Asp 
475 480 485 490 

TTC AAC TTT CAC ATC AAC TAC GGT GAC CTG GGC TTC CTG GGG CCT GAG 
Phe Asn Phe His Ile Asn Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu 
495 500 505 

GAT CTT CGG GTA TTT GGC TCC CAG AAT CTG ACC ACA GTG AAA CTA AAA 
Asp Leu Arg Val Phe Gly Ser Gln Asn Leu Thr Thr Val Lys Leu Lys 
510 515 520 

GGT GTG GGA GAG AGC TTC AAG AAA TAT CCT GAC TAT GAG TCC AAA GGC 
Gly Val Gly Glu Ser Phe Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly 
525 530 535 

ATC AAG GCC CAC TTT AAC CTA GAC GAG AGT GGA GTG CTC AGT TTA GAC 
Ile Lys Ala His Phe Asn Leu Asp Glu Ser Gly Val Leu Ser Leu Asp 
540 545 550 

AGG GTG GAG TCC GTA TTC GAG ACC CTG GTG GAG GAC AGC CCA GAG GAA 
Arg Val Glu Ser Val Phe Glu Thr Leu Val Glu Asp Ser Pro Glu Glu 
555 560 565 570 

GAG TCT ACT CTT ACC AAA CTT GGC AAC ACC ATT TCC AGC CTG TTT GGC 
Glu Ser Thr Leu Thr Lys Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly 
575 580 585 

GGT GGT ACC TCA TCA GAT GCC AAA GAG AAT GGT ACT GAT GCT GTA CAG 
Gly Gly Thr Ser Ser Asp Ala Lys Glu Asn Gly Thr Asp Ala Val Gln 
590 595 600 

GAG GAG GAG GAG AGC CCT GCT GAG GGG AGC AAG GAT GAG CCT GCA GAA 
Glu Glu Glu Glu Ser Pro Ala Glu Gly Ser Lys Asp Glu Pro Ala Glu 
605 610 615 

CAG GGG GAA CTC AAG GAG GAA GCT GAA GCC CCA ATG GAG GAT ACC TCC 
Gln Gly Glu Leu Lys Glu Glu Ala Glu Ala Pro Met Glu Asp Thr Ser 
620 625 630 

CAG CCT CCA CCC TCT GAG CCT AAG GGG GAT GCA GCC CGT GAG GGA GAA 
Gln Pro Pro Pro Ser Glu Pro Lys Gly Asp Ala Ala Arg Glu Gly Glu 
635 640 645 650 

ACA CCT GAT GAA AAA GAA AGT GGG GAC AAG TCT GAG GCC CAG AAG CCC 
Thr Pro Asp Glu Lys Glu Ser Gly Asp Lys Ser Glu Ala Gln Lys Pro 
655 660 665 

AAT GAG AAG GGG CAG GCA GGG CCT GAG GGT GTC CCT CCA GCT CCC GAG 
Asn Glu Lys Gly Gln Ala Gly Pro Glu Gly Val Pro Pro Ala Pro Glu 
670 675 680 

GAA GAA AAA AAG CAG AAA CCT GCC CGG AAG CAG AAA ATG GTG GAG GAG 
Glu Glu Lys Lys Gln Lys Pro Ala Arg Lys Gln Lys Met Val Glu Glu 
685 690 695 

ATA GGT GTG GAA CTG GCT GTC TTG GAC CTG CCA GAC TTG CCA GAG GAT 
Ile Gly Val Glu Leu Ala Val Leu Asp Leu Pro Asp Leu Pro Glu Asp 
700 705 710 

GAG CTG GCC CAT TCC GTG CAG AAA CTT GAG GAC TTG ACC CTG CGA GAC 2392 
Glu Leu Ala His Ser Val Gln Lys Leu Glu Asp Leu Thr Leu Arg Asp 
715 720 725 730 

CTT GAA AAG CAG GAG AGG GAG AAA GCT GCC AAC AGC TTA GAA GCT TTT 2440 
Leu Glu Lys Gln Glu Arg Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe 
735 740 745 

ATC TTT GAG ACC CAG GAC AAA CTG TAC CAA CCT GAG TAC CAG GAA GTG 2488 
Ile Phe Glu Thr Gln Asp Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val 
750 755 760 

TCC ACT GAG GAA CAA CGG GAG GAG ATC TCT GGA AAA CTC AGT GCC ACT 2536 
Ser Thr Glu Glu Gln Arg Glu Glu Ile Ser Gly Lys Leu Ser Ala Thr 
765 770 775 

TCT ACC TGG CTG GAG GAT GAG GGA TTT GGA GCC ACC ACT GTG ATG TTG 2584 
Ser Thr Trp Leu Glu Asp Glu Gly Phe Gly Ala Thr Thr Val Met Leu 
780 785 790 

AAG GAC CTG GCT GAG CTG AGA AAG CTG TGC CAA GGG CTG TTT TTT 2632 
Lys Asp Lys Leu Ala Glu Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe 
795 800 805 810 

CGG GTG GAA GAG CGC AGG AAA TGG CCA GAG CGG CTT TCA GCT CTG GAT 2680 
Arg Val Glu Glu Arg Arg Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp 
815 820 825 

AAT CTC CTC AAT CAC TCC AGC ATT TTC CTC AAG GGT GCC CGA CTC ATC 2728 
Asn Leu Leu Asn His Ser Ser Ile Phe Leu Lys Gly Ala Arg Leu Ile 
830 835 840 

CCA GAG ATG GAC CAG ATC TTC ACT GAC GTG GAG ATG ACA ACG TTG GAG 2776 
Pro Glu Met Asp Gln Ile Phe Thr Asp Val Glu Met Thr Thr Leu Glu 
845 850 855 

AAA GTC ATC AAT GAC ACC TGG ACC TGG AAG AAT GCA ACC CTG GCC GAG 2824 
Lys Val Ile Asn Asp Thr Trp Thr Trp Lys Asn Ala Thr Leu Ala Glu 
860 865 870 

CAG GCC AAG CTT CCT GCC ACA GAG AAA CCC GTG CTG CTT TCA AAA GAC 2872 
Gln Ala Lys Leu Pro Ala Thr Glu Lys Pro Val Leu Leu Ser Lys Asp 
875 880 885 890 

ATC GAG GCC AAA ATG ATG GCC CTG GAC CGG GAG GTG CAG TAT CTA CTC 2920 
Ile Glu Ala Lys Met Met Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu 
895 900 905 

AAT AAG GCC AAG TTT ACT AAA CCC CGG CCA CGG CCC AAG GAC AAG AAT 2968 
Asn Lys Ala Lys Phe Thr Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn 
910 915 920 

GGC ACC CGG ACA GAG CCT CCC CTC AAT GCC AGT GCT GGT GAC CAA GAG 3016 
Gly Thr Arg Thr Glu Pro Pro Leu Asn Ala Ser Ala Gly Asp Gln Glu 
925 930 935 

GAA AAG GTC ATT CCA CCT ACA GGC CAG ACT GAA GAG GCG AAG GCC ATC 3064 
Glu Lys Val Ile Pro Pro Thr Gly Gln Thr Glu Glu Ala Lys Ala Ile 
940 945 950 

TTA GAA CCT GAC AAA GAA GGG CTT GGT ACA GAG GCA GCA GAC TCI GAG 
Leu Glu Pro Asp Lys Glu Gly Leu Gly Thr Glu Ala Ala Asp Ser Glu 
955 960 965 970 

CCT CTG GAA TTA GGA GGT CCT GGT GCA GAA TCT GAA CAG GCA GAG CAG 
Pro Leu Glu Leu Gly Gly Pro Gly Ala Glu Ser Glu Gln Ala Glu Gln 
975 980 985 

ACA GCA GGG CAG AAG CGG CCT TTG AAG AAT GAT GAG CTG TGACCCCGCG 
Thr Ala Gly Gln Lys Arg Pro Leu Lys Asn Asp Glu Leu 
990 995 

CCTCCGCTCC ACTTGCCTCC AGCCCCTTCT CCTACCACCT CTA 
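
The pairing of codons and residues in the listing above can be spot-checked mechanically. The following is a minimal sketch in Python, assuming the standard genetic code; the partial table covers only the codons of the sample row (the row numbered 140 to 150 in the listing), and the names `CODON_TABLE` and `translate` are illustrative and not part of the specification.

```python
# Partial standard genetic code: only the codons needed for the sample row.
CODON_TABLE = {
    "AGT": "Ser", "CCG": "Pro", "CAG": "Gln", "CTG": "Leu", "TTC": "Phe",
    "TCT": "Ser", "CCC": "Pro", "GAG": "Glu", "GTG": "Val", "GGC": "Gly",
    "ATG": "Met", "GTT": "Val", "CTC": "Leu",
}


def translate(codon_row: str) -> str:
    """Translate a space-separated row of codons into three-letter residues."""
    return " ".join(CODON_TABLE[codon] for codon in codon_row.split())


# Codon row copied from the listing (row numbered 140 ... 150).
row = "AGT CCG CAG CTG CAG TTC TCT CCC GAG GAG GTG CTG GGC ATG GTT CTC"
print(translate(row))
```

The printed residues can be compared against the amino acid line under the same row, reading glutamine as Gln. Extending the table to all 64 codons would allow every row to be checked in the same way.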



(2) INFORMATION FOR SEQ ID NO : 5: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5: 

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala 
1 5 10 15 

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic nucleic acid" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6: 

AATACGACTC ACTATAGGGA 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Lys Pro Gly Val Pro Met Glu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic nucleic acid" 



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 6 

(D) OTHER INFORMATION: /note= "N at position 6 is an 
inosine residue." 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 9 

(D) OTHER INFORMATION: /note= "N at position 9 is an 
inosine residue." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

AARCCNGGNG TNCCNATGGA 
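
By inspection, the oligonucleotide of SEQ ID NO: 8 is consistent with a degenerate back-translation of the peptide of SEQ ID NO: 7 (Lys Pro Gly Val Pro Met Glu), truncated to 20 bases. The sketch below illustrates that relationship in Python under the assumption of the standard genetic code and IUPAC ambiguity codes (R = A or G, N = any base); the names `DEGENERATE_CODON` and `back_translate` are hypothetical helpers, not part of the specification.

```python
# IUPAC-degenerate codons for the residues occurring in SEQ ID NO: 7
# (standard genetic code). R = A or G; N = any base. The listing notes
# that some N positions of the primer are inosine residues.
DEGENERATE_CODON = {
    "Lys": "AAR",
    "Pro": "CCN",
    "Gly": "GGN",
    "Val": "GTN",
    "Met": "ATG",
    "Glu": "GAR",
}


def back_translate(peptide: str) -> str:
    """Back-translate space-separated three-letter residues into degenerate DNA."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide.split())


peptide = "Lys Pro Gly Val Pro Met Glu"   # SEQ ID NO: 7
primer = "AARCCNGGNGTNCCNATGGA"           # SEQ ID NO: 8, spaces removed

full = back_translate(peptide)            # 21 bases, one codon per residue
print(full[:len(primer)] == primer)       # compare the first 20 bases
```

Dropping the final wobble base of the Glu codon leaves a primer whose 3' end is not degenerate, which is one plausible reason for the 20-base length.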
(2) INFORMATION FOR SEQ ID NO : 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9: 

Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic nucleic acid" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 10: 

GCACCCTTGA GGAAAATGCT 20 

(2) INFORMATION FOR SEQ ID NO : 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "synthetic nucleic acid" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCAGAAGCC CAATGAGAAG 20 
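
The two synthetic oligonucleotides of SEQ ID NO: 10 and SEQ ID NO: 11 can be located in the coding sequence listed earlier. As an observation drawn from the listing rather than a statement made in it, SEQ ID NO: 11 appears to match the coding strand around the rows numbered 655 to 680, and SEQ ID NO: 10 appears to be the reverse complement of a stretch in the row numbered 830 to 840. A minimal Python sketch of that check follows; the fragment strings are short excerpts copied from those rows with spaces removed, and the variable and function names are illustrative.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Excerpts of the coding strand copied from the listing (spaces removed).
fragment_a = "GCCCAGAAGCCCAATGAGAAGGGG"   # end of row 655-665, start of row 670-680
fragment_b = "AGCATTTTCCTCAAGGGTGCC"      # within row 830-840

seq_id_10 = "GCACCCTTGAGGAAAATGCT"
seq_id_11 = "CCCAGAAGCCCAATGAGAAG"

# SEQ ID NO: 11 is tested in the sense orientation.
print(seq_id_11 in fragment_a)
# SEQ ID NO: 10 is tested in the antisense orientation.
print(reverse_complement(seq_id_10) in fragment_b)
```

If both checks hold, the pair is oriented as a forward and a reverse primer on the same template, although the specification itself should be consulted for how each oligonucleotide was actually used.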

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2861 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GAAAGAAGTA GACATGGGAG ACTTCATTTT GTTCTGTACT AAGAAAAATT CTTCTGCCTT 60 

GGGATGCTGT TGATCTATGA CCTTACCCCC AACCCTGTGC TCTCTGAAAC ATGTGCTGTG 120 

TCCACTCAGG GTTAAATGGA TTAAGGGCGG TGCAAGATGT GCTTTGTTAA ACAGATGCTT 180 

GAAGGCAGCA TGCTCGTTAG GAGTCATCAC CACTCCCTAA TCTCAAGTAC CCAGGGACAC 240 

AAACACTGCG GAAGGCCACA GGGTCCTCTG CCTAGGAAAG CCAGAGACCT TTGTTCACTT 300 

GTTTATCTGC TGACCTTCCC TCCACTATTG TCCTATGACC CTGCCAAATC CCCCTCTGCC 360 

AGAAACACCC AAGAATGATC AATAAAAAAA AAAAAAAAAA AAAAAGGAAG AATAGACTCT 420 

CTCTGGGACT GCCAATAATT TTTCCTTCTA AGCATAGACA CCGGACCACT CTCCACCTAA 480 

GCATCACGAA AAATGTAGAG AAAGGAAGAG CTAAGAGCTC CTTAAACAAG TTCAGGCTTG 540 

ACACAACCCT GGCCCTGACA GCCAGGGTCT TCAAGCGGGC CTTTCTGTGA AGGGTGGCCA 600 

GGCATCAACT TAGTAGGAGA GAAAACAGAT GACTTATTTC CATCCACACT TAAGGAAAAT 660 

GCAGTCTCCA AGGACTGCGT ACATTTCTTT TTCGAGAAGG AGTCTCGCTG TTGTCGCCCA 720 

GGCTGGAGTG CAGTGGCGCA GTCTGGGCTC ACAGCAACCT CTGCCTCCCG GATTCAAGCA 780 

ATTCTCCTGC CTCAGCCTCG TGAGTAGCTG GGATTACAGG CACCCGCCAC CACGCCTGGC 840 

TAATTTTTGT AGTTTTGGTA GAGACGGGGT TTCACCATGT TGGCCAGGCT GGTCTCGAAC 900 

TCCTGACCTC CAGTGATTCG CCCGCCTTGG CCTCCCAAAA TGCTGGGATT ACAGGCGTGA 960 

GCCACCGCGC CCGGGCGACT GCGCACATTT CTATGGAGCT GTAAGTTAAA AGAGAAGGCA 1020 

GTGAGGTGCT TCTGTCATTC TATGACAGAA ACAGCTAAAG AGTAGAGAAA TGTTCACAAG 1080 

ATTTAATAGA ACAGAAATAG GAGAAGGTGC ACACAAGCTC AACCAACTAT AGCCTCACAA 1140 

ATAAAAGTGT CTTTTGTGTG TAGTACTTAA GTTTGGAATA TTCTTTCTTA TACAAATGAG 1200 

TGGGGCTTAA CCTAAGAAAT CCTGGCCAGA TTCTGCGACG AATGCATCGG TTATCTCTGA 1260 

CCCATCAGCA AACATCTTTT TCTGTGGCTT CAGTTTCCTC AGTAAAACAG AGGGGGTTGC 1320 

GACGGACTCA GTCCGAGGCA CAGCCATTCT CCAACGTCTA TCCAAAGCCT AGGGCACCTC 1380 

AATACTAACC GGCAGGCCAG CGCCCCCTCC GCGGGGCTGC GGACAGGACG CCTGTTATTC 1440 

CATTCCTCGG CCGGGCTCTA CAGGTGACCG GAAGAAGAGC CCCGAGTGCG GGACTGCAGT 1500 

GCGCCCGACC TGCTCTAGGC GCAGGTCACT CCCGAACCCC GGCAGCAAAG CATCCAGCGC 1560 

CGGAAAAGGT CCCGCGGTCG CCCCGGGGCC GGCGCTGGGG AGGAAGGAGT GGAGCGCGCT 1620 

GGCCCCGTGA CGTGGTCCAA TCCCAGGCCG ACGCCGGCTG CTTCTGCCCA ACCGGTGGCT 1680 

GGTCCCCTCC GCCGCCCCCA TTACAAGGCT GGCAAAGGGA GGGGGCGGGG CCTGGGACGT 1740 

GGTCCAATGA GTACGCGCGC CGGGGCGGCG GGGGCGGGGC CGGGCGCGCA GCGCAGGGCC 1800 

GGGCGGCCGA GGCTCCAATG AGCGCCCGCC GCGTCCGGGG CCGGCTGGTG CGCGAGACGC 1860 

CGCCGAGAGG TTGGTGGCTA ATGTAACAGT TTGCAAACCG AGAGGAGTTG TGAAGGGCGC 1920 

GGGTGGGGGG CGCTGCCGGC CTCGTGGGTA CGTTCGTGCC GCGTCTGTCC CAGAGCTGGG 1980 

GCCGCAGGAG CGGAGGCAAG AGGTAGCGGG GGTGGATGGA GGTGCGGGCC GGCCACCCCT 2040 

CCTAGGGGAG ACAGCGTGCG AGCTCCGGGG GCGGGTCGGG AGCGCAAGGG AGGGCC3CGC 2100 

GGACGCCGGG CGCTCGGCCT CGCACCGGGG GGCACGCAGC TCGGCCCCCG GTCTGTCCCC 2160 

ACTTGCTGGG GCGGGCCGGG ATCCGTTTCC GGGAGTGGGA GCCGCCGCCT TCGTCAGGTC 2220 

GGGTTTAGGT GAACACCGGG TAACGGCTAC CCGCCGGGCG GGGAACCTTA CCGCCCCTGG 2280 

CACTGCGTCT GTGGGCACAG CGGGGCCGGG GAGTGAGCTG GGAAAGGGGA GGGGGCGGGA 2340 

CAACCCGCAG GGATGCCGAG GAGGAGATAG GCCTTTCCTT CATCCTAGCT ACCCCCAACG 2400 

TCATTACCTT TCTCTTCCCG TCCAGGCCCA GCTGGCTTTC CCCGTCAGCG GGGGAGCTCC 2460 

AGGTGTGGGG AGGTGGTTGA GCCCTGGGCG GGGATCCCTG GCCGCACCCC AGGTGTCTGA 2520 

CAACAGGCAC AGTGCTGCGG TGCGCCACTC ACTGCCTGTG TGGTGGACAA AAGGCTCGGG 2580 

TCTCCTTTCT CTTGTCCTGT TAGCTTCTCT GTTTAGGGAT GTGGCAAAGC CGAGGACCCA 2640 

TGCTCTTTCA CTTGGGCCTT TGTGTGGGCG CTGCTGGGAT GATTAGAGAA TGGTTTGTAC 2700 

CCATCAGGAG GGAGAAGGGG AGAAGTAGGC TGATCTGCCC TGGGTAAGAA TGAAGTAGAT 2760 

ATGAATCTTA CAGCCTCTCC GTTCTGGGAT GTGATTCTGT CTCCTTCACT CCGGGTATCC 2820 

AGTTTTAAGT GTTTTCTTTC TTCGCCTCCC CCAGGGGCAC T 2861 



Claims 

1. A polynucleotide encoding an ORP150 polypeptide selected from the group consisting of: 

(a) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:1 or 
a fragment of the polypeptide; 

(b) polynucleotides comprising the coding region of the nucleotide sequence as shown in SEQ ID NO:2 or a 
fragment thereof; 

(c) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:3 or 
a fragment of the polypeptide; 

(d) polynucleotides comprising the coding region of the nucleotide sequence as depicted in SEQ ID NO:4 or a 
fragment thereof; 

(e) polynucleotides encoding an ORP150 polypeptide which differs from the polypeptide encoded by the 
polynucleotide of (a) or (c) due to deletion(s), addition(s), insertion(s) and/or substitution(s) of one or more amino 
acid residues; and 

(f) polynucleotides the complementary strand of which hybridizes to a polynucleotide of any one of (a) to (e) 
and which encode an ORP150 polypeptide; 

and the complementary strand of such a polynucleotide. 

2. The polynucleotide of claim 1 which is DNA. 

3. The polynucleotide of claim 2 which is genomic DNA. 

4. The polynucleotide of claim 1 which is RNA. 

5. A vector comprising the polynucleotide of any one of claims 1 to 4. 

6. The vector of claim 5, in which the polynucleotide is operatively linked to regulatory elements which allow for 
expression in prokaryotic or eukaryotic host cells. 

7. A host cell transformed and genetically engineered with a polynucleotide of any one of claims 1 to 4 or with a vector 
of claim 5 or 6. 

8. A process for the preparation of an ORP150 polypeptide comprising culturing the host cell of claim 7 and recover- 
ing the polypeptide from the cells and/or the culture medium. 

9. A polypeptide encoded by the polynucleotide of any one of claims 1 to 4 or obtainable by the process of claim 8. 

10. An antibody or fragment thereof which specifically recognizes the polypeptide of claim 9. 

11. A nucleic acid molecule which specifically hybridizes to a polynucleotide of any one of claims 1 to 4. 

12. A pharmaceutical composition comprising a polynucleotide of any one of claims 1 to 4, the polypeptide of claim 9, 
the antibody of claim 10 and/or the nucleic acid molecule of claim 11 and optionally a pharmaceutically acceptable 
carrier. 



13. A diagnostic composition comprising a polynucleotide of any one of claims 1 to 4, the polypeptide of claim 9, the 
antibody of claim 10 and/or the nucleic acid molecule of claim 11. 

14. Use of the polynucleotide of any one of claims 1 to 4, the polypeptide of claim 9, the antibody of claim 10 or the 
nucleic acid molecule of claim 11 for the preparation of a pharmaceutical composition for the treatment of ischemic 
diseases. 



15. A nucleic acid molecule having promoter activity and being able to promote transcription in cells when exposed to 
hypoxia selected from the group consisting of: 

(a) polynucleotides comprising the nucleotide sequence as depicted in SEQ ID NO:12 or a fragment thereof; 
and 

(b) polynucleotides hybridizing with the polynucleotide of (a). 